From lmao20001 at gmail.com Sat Mar 1 07:42:55 2014 From: lmao20001 at gmail.com (Leo Mao) Date: Sat, 1 Mar 2014 20:42:55 +0800 Subject: [Numpy-discussion] GSoC 2014 NumPy Message-ID: Hello, I'm a student studying electrical engineering. I am interested in contributing to NumPy and applying GSoC 2014. I have experience of python and C/C++ programming, and I have already seen https://github.com/scipy/scipy/wiki/GSoC-project-ideas. But on that page only the project "Improve Numpy datetime functionality" targets NumPy. Currently I am trying to come up with some ideas about enhancing NumPy. I will be grateful if someone could give me some ideas or guide me as how to do next. BTW, I am looking into the issue list of NumPy on github and trying to make some small bugfixes/enhancement for NumPy. Thanks in advance. Regards: Leo Mao, student in National Taiwan University, Email: lmao20001 at gmail.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From mohammedfaisal.anees at students.iiit.ac.in Sat Mar 1 09:42:36 2014 From: mohammedfaisal.anees at students.iiit.ac.in (faisal anees) Date: Sat, 1 Mar 2014 20:12:36 +0530 Subject: [Numpy-discussion] GSOC 2014 : "Improve Numpy datetime functionality" Message-ID: Hi , I am Mohammed Faisal Anees , a Computer Science student at IIIT- Hyderabad. I was going though the ideas page and I found "Improve Numpy datetime functionality" really interesting , and it suits my experience as I have a considerable(hopefully !!) amount of experience in C/C++, and Python . Right now I am going through the link suggested by you and the codebase. Thanks -------------- next part -------------- An HTML attachment was scrubbed... URL: From rays at blue-cove.com Sat Mar 1 10:25:02 2014 From: rays at blue-cove.com (RayS) Date: Sat, 01 Mar 2014 07:25:02 -0800 Subject: [Numpy-discussion] GSoC 2014 NumPy In-Reply-To: References: Message-ID: <201403011524.s21FOwNx011075@blue-cove.com> At 04:42 AM 3/1/2014, you wrote: >Currently I am trying to come up with some ideas about enhancing NumPy. Hello Leo, How about you implement fft.zoom_fft() as a single function? (Not to be confused with chirp-Z) We might be able to lend some ideas, but I've never been satisfied with mine: http://mail.scipy.org/pipermail/numpy-discussion/2007-March/026529.html and Matlab https://www.mathworks.com/matlabcentral/newsreader/view_thread/85241 http://www.numerix-dsp.com/zoomfft.html http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/doc/voicebox/zoomfft.html - Ray S From ndarray at mac.com Sat Mar 1 21:09:37 2014 From: ndarray at mac.com (Alexander Belopolsky) Date: Sat, 1 Mar 2014 21:09:37 -0500 Subject: [Numpy-discussion] ndarray is not a sequence In-Reply-To: <-5567695652207982944@unknownmsgid> References: <780064138415269619.294722sturla.molden-gmail.com@news.gmane.org> <1393578222.6392.2.camel@sebastian-t440> <-5567695652207982944@unknownmsgid> Message-ID: On Fri, Feb 28, 2014 at 10:34 AM, Chris Barker - NOAA Federal < chris.barker at noaa.gov> wrote: > > > Whatever happened to duck typing? > http://legacy.python.org/dev/peps/pep-3119/#abcs-vs-duck-typing -------------- next part -------------- An HTML attachment was scrubbed... URL: From lmao20001 at gmail.com Sat Mar 1 23:12:16 2014 From: lmao20001 at gmail.com (Leo Mao) Date: Sun, 2 Mar 2014 12:12:16 +0800 Subject: [Numpy-discussion] GSoC 2014 NumPy Message-ID: Hello Ray, Thanks for your suggestion! I just read the links you provided and I think I can implement it as long as I do further research on zoom fft algorithm. So I wonder if this can be a GSoC project? Maybe I should extend this idea or combine it with other ideas? BTW, just for curiosity, why we need both scipy.linalg and numpy.linalg? Is implementing all functions in numpy a bad idea? Regards, Leo Mao -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Sun Mar 2 06:52:27 2014 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Sun, 02 Mar 2014 12:52:27 +0100 Subject: [Numpy-discussion] ndarray is not a sequence In-Reply-To: References: <780064138415269619.294722sturla.molden-gmail.com@news.gmane.org> <1393578222.6392.2.camel@sebastian-t440> Message-ID: <1393761147.8728.0.camel@sebastian-t440> On Fr, 2014-02-28 at 09:10 -0600, Anthony Scopatz wrote: > Thanks All, > > > I am sorry I missed the issue. (I still can't seem to find it, > actually.) I agree that there would be minimal overhead here and I > bet that would be easy to show. I really look forward to seeing this > get in! > Best way to make sure it happens soon is to open a pull request about it ;). - Sebastian > > Be Well > Anthony > > > On Fri, Feb 28, 2014 at 4:59 AM, Robert Kern > wrote: > On Fri, Feb 28, 2014 at 9:03 AM, Sebastian Berg > wrote: > > On Fr, 2014-02-28 at 08:47 +0000, Sturla Molden wrote: > >> Anthony Scopatz wrote: > >> > Hello All, > >> > > >> > The semantics of this seem quite insane to me: > >> > > >> > In [1]: import numpy as np > >> > > >> > In [2]: import collections > >> > > >> > In [4]: isinstance(np.arange(5), collections.Sequence) > Out[4]: False > >> > > >> > In [6]: np.version.full_version > >> > Out[6]: '1.9.0.dev-eb40f65' > >> > > >> > Is there any possibility that ndarray could inherit (in > the last place) > >> > from collections.Sequence? It seems like this would only > be a 1 - 5 line > >> > fix somewhere. I just spent a few hours tracking down a > bug related to > >> > this. Thanks for considering! > >> > >> This should be very easy to do. But what would this give > us, and what would > >> the extra overhead be? collections.Sequence is basically an > abstract base > >> class. If this just slows down ndarray it would be highly > undesirable. Note > >> that ndarray has a very specific use (numerical computing). > If inheriting > >> collections.Sequence has no benefit for numerical computing > it is just > >> wasteful overhead. In this resepect ndarray is very > different for other > >> Python containers in that they have no specific use and > computational > >> performance is not a big issue. > > > > There is no overhead for the array itself. > > > Right, since it's an abstract base class, we don't need to > subclass > from Sequence, just register ndarray with it. > > > The biggest concern is about > > corner cases like 0-d arrays. > > > I think it's reasonable to allow it. The pre-ABC way to check > this > kind of thing also gives a false positive on 0-d arrays, so > we're not > regressing. > > [~] > |1> import operator > > [~] > |2> operator.isSequenceType(np.array(5)) > True > > > That said we probably need to do it anyway > > because the sequence check like that seems standard in > python 3. There > > is an issue about it open on github with some discussion > about this > > issue. > > > https://github.com/numpy/numpy/issues/2776 > > Also, while we're doing this, we should also register the > scalar types > with their appropriate ABCs: > > numbers.Real.register(np.floating) > numbers.Integral.register(np.integer) > numbers.Complex.register(np.complexfloating) > > -- > Robert Kern > > On Fri, Feb 28, 2014 at 9:03 AM, Sebastian Berg > wrote: > > On Fr, 2014-02-28 at 08:47 +0000, Sturla Molden wrote: > >> Anthony Scopatz wrote: > >> > Hello All, > >> > > >> > The semantics of this seem quite insane to me: > >> > > >> > In [1]: import numpy as np > >> > > >> > In [2]: import collections > >> > > >> > In [4]: isinstance(np.arange(5), collections.Sequence) > Out[4]: False > >> > > >> > In [6]: np.version.full_version > >> > Out[6]: '1.9.0.dev-eb40f65' > >> > > >> > Is there any possibility that ndarray could inherit (in > the last place) > >> > from collections.Sequence? It seems like this would only > be a 1 - 5 line > >> > fix somewhere. I just spent a few hours tracking down a > bug related to > >> > this. Thanks for considering! > >> > > >> > >> This should be very easy to do. But what would this give > us, and what would > >> the extra overhead be? collections.Sequence is basically an > abstract base > >> class. If this just slows down ndarray it would be highly > undesirable. Note > >> that ndarray has a very specific use (numerical computing). > If inheriting > >> collections.Sequence has no benefit for numerical computing > it is just > >> wasteful overhead. In this resepect ndarray is very > different for other > >> Python containers in that they have no specific use and > computational > >> performance is not a big issue. > >> > > > > There is no overhead for the array itself. The biggest > concern is about > > corner cases like 0-d arrays. That said we probably need to > do it anyway > > because the sequence check like that seems standard in > python 3. There > > is an issue about it open on github with some discussion > about this > > issue. > > > > - Sebastian > > > > > >> Sturla > >> > >> _______________________________________________ > >> NumPy-Discussion mailing list > >> NumPy-Discussion at scipy.org > >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > >> > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > -- > Robert Kern > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From oscar.j.benjamin at gmail.com Sun Mar 2 13:10:54 2014 From: oscar.j.benjamin at gmail.com (Oscar Benjamin) Date: Sun, 2 Mar 2014 18:10:54 +0000 Subject: [Numpy-discussion] 1.8.1 release In-Reply-To: References: <1393432035193-36655.post@n7.nabble.com> Message-ID: On 26 February 2014 22:48, Matthew Brett wrote: > > In that case, the OSX instructions could (within the next few months) > be as simple as: > > Install python from binary installer at python.org > curl -O https://raw.github.com/pypa/pip/master/contrib/get-pip.py > python get-pip.py > pip install scipy-stack > > or similar. Python 3.4 will be released by then and will ship with pip pre-installed: http://legacy.python.org/dev/peps/pep-0453/ So then it could be: 1) Install Python 3.4 from binary installer at python.org 2) pip install scipy-stack. Oscar From jenny.stone125 at gmail.com Sun Mar 2 17:16:57 2014 From: jenny.stone125 at gmail.com (Jennifer stone) Date: Mon, 3 Mar 2014 03:46:57 +0530 Subject: [Numpy-discussion] Suggestions for GSoC Projects In-Reply-To: References: Message-ID: sp.hyp2f1(10,5,-300.5,0.5) >>>>-6.5184949735e+156 The present implementation of the function in scipy, involves one Euler Transform followed by the application of Power Series (as a and b turn negative after Euler Transform is applied once). This most probably blows up the values of function as |c|>>>|a| or |b|. In one of the attempts to debug this the function hyp2f1 (when c<0; a,b>0; |c|>>a,b) was made to call hys2f1 (power series) without Euler Transformation. The result was the same as what mpmath gives when the precision is less (~100), as the error tolerance for hys2f1 is high. (MACHEP of order 10^-17). On increasing the sensitivity by changing the local tolerance to the order of 10^-200, it works perfect. What are the implications of adopting this method in the cases where hyp2f1 fails to give accurate results? Except of course the fact that this implementation would be heavy. > Our current hyp2f1 implementation does use recurrences (hyp2f1ra), but > perhaps they are not invoked for this case. The problem here can be the > accurate determination of the convergence region for each parameter value. > > The straight-forwardness of power series and its resemblance to mpmath tempted me to first try with hys2f1, However I have a strong feeling that owing to strong sensitivity of recurrence function in general, the implementation will be much faster. However direct implementation without a change in sensitivity fails to give the required answer. Regards Jenny > > -- > Pauli Virtanen > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sun Mar 2 21:15:53 2014 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 2 Mar 2014 19:15:53 -0700 Subject: [Numpy-discussion] How security holes happen Message-ID: This is from OS X 9 if ((err = SSLHashSHA1.update(&hashCtx, &serverRandom)) != 0) goto fail; if ((err = SSLHashSHA1.update(&hashCtx, &signedParams)) != 0) goto fail; goto fail; if ((err = SSLHashSHA1.final(&hashCtx, &hashOut)) != 0) goto fail; Heh, maybe there is a reason for braces in even the simplest if statements. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From toddrjen at gmail.com Mon Mar 3 07:52:17 2014 From: toddrjen at gmail.com (Todd) Date: Mon, 3 Mar 2014 13:52:17 +0100 Subject: [Numpy-discussion] How security holes happen In-Reply-To: References: Message-ID: On Mar 3, 2014 3:16 AM, "Charles R Harris" wrote: > > This is from OS X 9 > > if ((err = SSLHashSHA1.update(&hashCtx, &serverRandom)) != 0) > goto fail; > if ((err = SSLHashSHA1.update(&hashCtx, &signedParams)) != 0) > goto fail; > goto fail; > if ((err = SSLHashSHA1.final(&hashCtx, &hashOut)) != 0) > goto fail; > > Heh, maybe there is a reason for braces in even the simplest if statements. > > Chuck Not to mention static code analyzers. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ghisvail at gmail.com Mon Mar 3 09:09:28 2014 From: ghisvail at gmail.com (Ghislain Vaillant) Date: Mon, 3 Mar 2014 14:09:28 +0000 Subject: [Numpy-discussion] proper way to test Numpy version in C/C++ In-Reply-To: References: Message-ID: Would something like: #include "numpy/arrayobject.h" // for compatibility with Numpy version <= 1.6 #if NPY_FEATURE_VERSION < 0x00000007 #define NPY_ARRAY_FARRAY NPY_FARRAY // other defines for deprecated stuff // ... #endif Be robust enough ? 2014-02-28 14:31 GMT+00:00 Ghislain Vaillant : > Hi everyone, > > I have got code for some python wrappers of a scientific library which > needs to support both Numpy 1.6 and later versions. > > The build of the wrapper (using swig) stopped working because of the > deprecated API introduced in v1.7. The error only concerns the renaming of > some macros from NPY_XXX to NPY_ARRAY_XXX. I was thinking to just check for > the Numpy version at build time and add corresponding #define to provide > the necessary renaming in case the build is done with Numpy v1.6. > > How can I robustly test for Numpy's version API in C ? > > Ghis > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ndbecker2 at gmail.com Mon Mar 3 09:19:09 2014 From: ndbecker2 at gmail.com (Neal Becker) Date: Mon, 3 Mar 2014 09:19:09 -0500 (EST) Subject: [Numpy-discussion] How security holes happen References: Message-ID: Todd Wrote in message: > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > use modern programming languages with well designed exception handling -- ----Android NewsGroup Reader---- http://www.piaohong.tk/newsgroup From ben.root at ou.edu Mon Mar 3 09:51:08 2014 From: ben.root at ou.edu (Benjamin Root) Date: Mon, 3 Mar 2014 09:51:08 -0500 Subject: [Numpy-discussion] How security holes happen In-Reply-To: References: Message-ID: And, you know... unit tests to actually know if a the code would reject a spoofed certificate? -------------- next part -------------- An HTML attachment was scrubbed... URL: From sc.kwok at hsantalucia.it Mon Mar 3 10:06:45 2014 From: sc.kwok at hsantalucia.it (Sze Chai kwok) Date: Mon, 03 Mar 2014 16:06:45 +0100 Subject: [Numpy-discussion] Installing In-Reply-To: <5314999E.30403@hsantalucia.it> References: <5314999E.30403@hsantalucia.it> Message-ID: <53149A85.7060701@hsantalucia.it> > I have this message when installing numpy, any ideas? Thanks > -- > Sze Chai Kwok, D.Phil > Neuroimaging Laboratory > Santa Lucia Foundation > Via Ardeatina 306 > 00179 Rome Italy > Tel: +39 06 5150 1459 -- Sze Chai Kwok, D.Phil Neuroimaging Laboratory Santa Lucia Foundation Via Ardeatina 306 00179 Rome Italy Tel: +39 06 5150 1459 -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/png Size: 43833 bytes Desc: not available URL: From sd at syntonetic.com Mon Mar 3 10:38:59 2014 From: sd at syntonetic.com (=?ISO-8859-1?Q?S=F8ren?=) Date: Mon, 03 Mar 2014 16:38:59 +0100 Subject: [Numpy-discussion] Installing In-Reply-To: <53149A85.7060701@hsantalucia.it> References: <5314999E.30403@hsantalucia.it> <53149A85.7060701@hsantalucia.it> Message-ID: <5314A213.7000104@syntonetic.com> Hi Sze You need Python 2.7.x 32-bit version installed. I experienced this once when I accidentally had the 64-bit version of Python installed. kind regards S?ren From charlesr.harris at gmail.com Mon Mar 3 11:23:07 2014 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 3 Mar 2014 09:23:07 -0700 Subject: [Numpy-discussion] 1.8.1rc1 on sourceforge. Message-ID: Hi All, Julian Taylor has put windows binaries and sources for the 1.8.1 release candidate up on sourceforge. If things go well, it will taken to a full release in a week or so. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Mon Mar 3 11:59:10 2014 From: chris.barker at noaa.gov (Chris Barker) Date: Mon, 3 Mar 2014 08:59:10 -0800 Subject: [Numpy-discussion] How security holes happen In-Reply-To: References: Message-ID: And significant indentation! really, no one beat me to that? ;-) There was a nice Blog post about this from a Google Chrome developer -- less critical than I'd think, who pointed out that it's really hard to write unit tests for this sort of thing, due to the need for a LOT of scaffolding -- but why integration tests didn't find it is beyond me.... Also -- code review anyone? (not that my code is well reviewed or thoroughly tested -- but I'm not writting security code used my millions of people...) The other oddity is that Apple is saying that they don't know when or how this got into the code -- do they REALY not have a decent version control system???? Or maybe they are being nice to whoever did make this mistake... -Chris On Mon, Mar 3, 2014 at 6:51 AM, Benjamin Root wrote: > And, you know... unit tests to actually know if a the code would reject a > spoofed certificate? > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Mon Mar 3 12:23:33 2014 From: chris.barker at noaa.gov (Chris Barker) Date: Mon, 3 Mar 2014 09:23:33 -0800 Subject: [Numpy-discussion] ndarray is not a sequence In-Reply-To: References: <780064138415269619.294722sturla.molden-gmail.com@news.gmane.org> <1393578222.6392.2.camel@sebastian-t440> <-5567695652207982944@unknownmsgid> Message-ID: On Sat, Mar 1, 2014 at 6:09 PM, Alexander Belopolsky wrote: > On Fri, Feb 28, 2014 at 10:34 AM, Chris Barker - NOAA Federal < > chris.barker at noaa.gov> wrote: > >> >> Whatever happened to duck typing? >> > > http://legacy.python.org/dev/peps/pep-3119/#abcs-vs-duck-typing > Sure -- but I'm afraid that there will be a lot of code that does an isinstance() check where it it absolutely unnecessary. If you really need to know if something is a sequence or a mapping, I suppose it's required, but how often is that? I suppose there are two sides to this coin: 1) Should numpy arrays use the ABC properly so that client code will recognise them as sequences -- probably, yes -- why not? (though the trick here is that numpy arrays do act differently than other mutable sequences -- view semantics and all -- so I think it's kind of dangerous) 2) If you are writing client code, should you use an isinstance() check passed-in objects? -- probably not! -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Mon Mar 3 12:38:41 2014 From: chris.barker at noaa.gov (Chris Barker) Date: Mon, 3 Mar 2014 09:38:41 -0800 Subject: [Numpy-discussion] GSOC 2014 : "Improve Numpy datetime functionality" In-Reply-To: References: Message-ID: On Sat, Mar 1, 2014 at 6:42 AM, faisal anees < mohammedfaisal.anees at students.iiit.ac.in> wrote: > I am Mohammed Faisal Anees , a Computer Science student at IIIT- > Hyderabad. I was going though the ideas page and I found "Improve Numpy > datetime functionality" really interesting , > It's great to have the interest! One trick is that I dont hink there is a consensus on what needs to be done to "improve datetime". One thing that does need to be done is cleaning up time zone handling -- and I think we ALMOST have consensus on how to do that (at least the quick fix part). Take a look for threads on this list in the last year about "datetime64" and this ticket: https://github.com/numpy/numpy/issues/3388 (and search for other datetime64 tickets...) Note that we are hoping to fix that for 1.9 -- anyone have a time scale for that? Too soon for GSoC? -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From wrw at mac.com Mon Mar 3 13:39:48 2014 From: wrw at mac.com (William Ray Wing) Date: Mon, 03 Mar 2014 13:39:48 -0500 Subject: [Numpy-discussion] How security holes happen In-Reply-To: References: Message-ID: On Mar 3, 2014, at 11:59 AM, Chris Barker wrote: > And significant indentation! > > really, no one beat me to that? > > ;-) > > There was a nice Blog post about this from a Google Chrome developer -- less critical than I'd think, who pointed out that it's really hard to write unit tests for this sort of thing, due to the need for a LOT of scaffolding -- but why integration tests didn't find it is beyond me.... > > Also -- code review anyone? > > (not that my code is well reviewed or thoroughly tested -- but I'm not writting security code used my millions of people...) > > The other oddity is that Apple is saying that they don't know when or how this got into the code -- do they REALY not have a decent version control system???? Or maybe they are being nice to whoever did make this mistake... > > -Chris Apple has been known to contract out and/or buy some of its software from third parties. I wouldn?t be a bit surprised to discover that this was part of such a package. It represents such a common and fundamental library that it might well be the sort of thing they found it cheaper to buy. Of course, that begs a follow-on question or two - who else might be using it, and was the cost savings worth the loss of reputation? Bill From sebastian at sipsolutions.net Mon Mar 3 13:56:07 2014 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Mon, 03 Mar 2014 19:56:07 +0100 Subject: [Numpy-discussion] 1.8.1rc1 on sourceforge. In-Reply-To: References: Message-ID: <1393872967.10138.0.camel@sebastian-t440> On Mo, 2014-03-03 at 09:23 -0700, Charles R Harris wrote: > Hi All, > > > Julian Taylor has put windows binaries and sources for the 1.8.1 > release candidate up on sourceforge. If things go well, it will taken > to a full release in a week or so. > Thanks to both of you. Also for sieving through all those issues before :). - Sebastian > > Chuck > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From ralf.gommers at gmail.com Mon Mar 3 14:02:01 2014 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Mon, 3 Mar 2014 20:02:01 +0100 Subject: [Numpy-discussion] GSOC 2014 : "Improve Numpy datetime functionality" In-Reply-To: References: Message-ID: On Mon, Mar 3, 2014 at 6:38 PM, Chris Barker wrote: > On Sat, Mar 1, 2014 at 6:42 AM, faisal anees < > mohammedfaisal.anees at students.iiit.ac.in> wrote: > >> I am Mohammed Faisal Anees , a Computer Science student at IIIT- >> Hyderabad. I was going though the ideas page and I found "Improve Numpy >> datetime functionality" really interesting , >> > > It's great to have the interest! > > One trick is that I dont hink there is a consensus on what needs to be > done to "improve datetime". > > One thing that does need to be done is cleaning up time zone handling -- > and I think we ALMOST have consensus on how to do that (at least the quick > fix part). > > Take a look for threads on this list in the last year about "datetime64" > and this ticket: > > https://github.com/numpy/numpy/issues/3388 > > (and search for other datetime64 tickets...) > > Note that we are hoping to fix that for 1.9 -- anyone have a time scale > for that? Too soon for GSoC? > Yes too soon and no, unlikely to happen given that no one volunteered yet. So seeing a solid GSoC proposal would be good. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From jtaylor.debian at googlemail.com Mon Mar 3 14:20:31 2014 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Mon, 03 Mar 2014 20:20:31 +0100 Subject: [Numpy-discussion] numpy gsoc topic idea: configurable algorithm precision and vector math library integration Message-ID: <5314D5FF.2070104@googlemail.com> hi, as the numpy gsoc topic page is a little short on options I was thinking about adding two topics for interested students. But as I have no experience with gsoc or mentoring and the ideas are not very fleshed out yet I'd like to ask if it might make sense at all: 1. configurable algorithm precision some functions in numpy could be implemented in different ways depending on requirements on speed and numerical precision. Two examples that come to my mind are hypot and sum hypot (or abs(complex)) use the C99 hypot function which guarantees 1ulp precision but is very slow compared to a simple sqrt(a**2 +b**2). This precision might not be required for all applications, overflow safety might be enough. summation in numpy 1.9 is performed via pairwise summation which has O(log(n)*e) error properties, but only for the fast axis. An alternative O(e) approach would be kahan summation which works for all axis but is 4 time slower than normal summation (a bit can be regained via vectorization thought) My idea is have an option to change the algorithms used in numpy depending on the set requirements. E.g. with np.precmode(default="fast"): np.abs(complex_array) or fast everything except sum and hypot with np.precmode(default="fast", sum="kahan", hypot="standard"): np.sum(d) I have not though much about implementation, it might be tricky to get this threadsafe in the current ufunc model. 2. vector math library integration some operations like powers, sin, cos etc are relatively slow in numpy depending on the c library used. There are now a few free libraries available that make use of modern hardware to speed these operations up, e.g. sleef and yeppp (also mkl but I have no interest in supporting non-free software) It might be interesting to investigate if these libraries can be integrated with numpy. This also somewhat ties in with the configurable precision mode as the vector math libraries often have different options depending on precision and speed requirements. Do those sound like topics we could add to our wiki? From Nicolas.Rougier at inria.fr Mon Mar 3 16:06:57 2014 From: Nicolas.Rougier at inria.fr (Nicolas Rougier) Date: Mon, 3 Mar 2014 22:06:57 +0100 Subject: [Numpy-discussion] dtype promotion Message-ID: <0F46B6BC-7D49-4591-AFFE-6E724CE95219@inria.fr> Hi all, I'm using numpy 1.8.0 (osx 10.9, python 2.7.6) and I can't understand dtype promotion in the following case: >>> Z = np.zeros((2,2),dtype=np.float32) + 1 >>> print Z.dtype float32 >>> Z = np.zeros((2,2),dtype=np.float32) + (1,1) >>> print Z.dtype float64 Is this the expected behavior ? What it the difference between the two lines ? Nicolas From sebastian at sipsolutions.net Mon Mar 3 16:25:18 2014 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Mon, 03 Mar 2014 22:25:18 +0100 Subject: [Numpy-discussion] dtype promotion In-Reply-To: <0F46B6BC-7D49-4591-AFFE-6E724CE95219@inria.fr> References: <0F46B6BC-7D49-4591-AFFE-6E724CE95219@inria.fr> Message-ID: <1393881918.2823.5.camel@sebastian-laptop> On Mon, 2014-03-03 at 22:06 +0100, Nicolas Rougier wrote: > Hi all, > > I'm using numpy 1.8.0 (osx 10.9, python 2.7.6) and I can't understand dtype promotion in the following case: > > >>> Z = np.zeros((2,2),dtype=np.float32) + 1 > >>> print Z.dtype > float32 > > >>> Z = np.zeros((2,2),dtype=np.float32) + (1,1) > >>> print Z.dtype > float64 > > > Is this the expected behavior ? > What it the difference between the two lines ? > It is intended I guess, scalars (including 0-d arrays such `np.array(1)`) behave differently form normal arrays. Their type is not as important and in many cases with integers even the value gets important. I did not think through this exact case, and there are some funnier corners which have been discussed a lot, but basically you have to expect different casting when scalars are involved (don't trust the scalar dtype to win). (Of course in this context, you always have to imagin an `np.asarray` call) - Sebastian > > > Nicolas > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From hoogendoorn.eelco at gmail.com Mon Mar 3 16:26:20 2014 From: hoogendoorn.eelco at gmail.com (Eelco Hoogendoorn) Date: Mon, 3 Mar 2014 22:26:20 +0100 Subject: [Numpy-discussion] dtype promotion In-Reply-To: <0F46B6BC-7D49-4591-AFFE-6E724CE95219@inria.fr> References: <0F46B6BC-7D49-4591-AFFE-6E724CE95219@inria.fr> Message-ID: The tuple gets cast to an ndarray; which invokes a different codepath than the scalar addition. Somehow, numpy has gotten more aggressive at upcasting to float64 as of 1.8, but I havnt been able to discover the logic behind it either. On Mon, Mar 3, 2014 at 10:06 PM, Nicolas Rougier wrote: > > Hi all, > > I'm using numpy 1.8.0 (osx 10.9, python 2.7.6) and I can't understand > dtype promotion in the following case: > > >>> Z = np.zeros((2,2),dtype=np.float32) + 1 > >>> print Z.dtype > float32 > > >>> Z = np.zeros((2,2),dtype=np.float32) + (1,1) > >>> print Z.dtype > float64 > > > Is this the expected behavior ? > What it the difference between the two lines ? > > > > Nicolas > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Mon Mar 3 16:42:58 2014 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Mon, 3 Mar 2014 22:42:58 +0100 Subject: [Numpy-discussion] GSoC 2014 NumPy In-Reply-To: References: Message-ID: On Sun, Mar 2, 2014 at 5:12 AM, Leo Mao wrote: > Hello Ray, > Thanks for your suggestion! I just read the links you provided and I think > I can implement it as long as I do further research on zoom fft algorithm. > So I wonder if this can be a GSoC project? > By itself that's not enough for a GSoC project. > Maybe I should extend this idea or combine it with other ideas? > It's possible to come up with an interesting proposal in this area I think. An issue may be that the FFT code in numpy and scipy isn't very actively worked on at the moment, so finding a suitable mentor could be tricky. > BTW, just for curiosity, why we need both scipy.linalg and numpy.linalg? > Is implementing all functions in numpy a bad idea? > The overlap is mostly due to historical reasons. The long-term plan is to remove duplicate functions from scipy.linalg. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Mon Mar 3 16:44:18 2014 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Mon, 03 Mar 2014 22:44:18 +0100 Subject: [Numpy-discussion] dtype promotion In-Reply-To: References: <0F46B6BC-7D49-4591-AFFE-6E724CE95219@inria.fr> Message-ID: <1393883058.2823.9.camel@sebastian-laptop> On Mon, 2014-03-03 at 22:26 +0100, Eelco Hoogendoorn wrote: > The tuple gets cast to an ndarray; which invokes a different codepath > than the scalar addition. > > > Somehow, numpy has gotten more aggressive at upcasting to float64 as > of 1.8, but I havnt been able to discover the logic behind it either There were changes in the casting logic in 1.7, I think. I can't really remember changes after that, so if there is a change, we might want to check it out. (Or I am just missing something completly :)) - Sebastian > > On Mon, Mar 3, 2014 at 10:06 PM, Nicolas Rougier > wrote: > > Hi all, > > I'm using numpy 1.8.0 (osx 10.9, python 2.7.6) and I can't > understand dtype promotion in the following case: > > >>> Z = np.zeros((2,2),dtype=np.float32) + 1 > >>> print Z.dtype > float32 > > >>> Z = np.zeros((2,2),dtype=np.float32) + (1,1) > >>> print Z.dtype > float64 > > > Is this the expected behavior ? > What it the difference between the two lines ? > > > > Nicolas > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From ralf.gommers at gmail.com Mon Mar 3 16:51:15 2014 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Mon, 3 Mar 2014 22:51:15 +0100 Subject: [Numpy-discussion] numpy gsoc topic idea: configurable algorithm precision and vector math library integration In-Reply-To: <5314D5FF.2070104@googlemail.com> References: <5314D5FF.2070104@googlemail.com> Message-ID: On Mon, Mar 3, 2014 at 8:20 PM, Julian Taylor wrote: > hi, > > as the numpy gsoc topic page is a little short on options I was thinking > about adding two topics for interested students. But as I have no > experience with gsoc or mentoring and the ideas are not very fleshed out > yet I'd like to ask if it might make sense at all: > > 1. configurable algorithm precision > > some functions in numpy could be implemented in different ways depending > on requirements on speed and numerical precision. > Two examples that come to my mind are hypot and sum > > hypot (or abs(complex)) use the C99 hypot function which guarantees 1ulp > precision but is very slow compared to a simple sqrt(a**2 +b**2). > This precision might not be required for all applications, overflow > safety might be enough. > > summation in numpy 1.9 is performed via pairwise summation which has > O(log(n)*e) error properties, but only for the fast axis. > An alternative O(e) approach would be kahan summation which works for > all axis but is 4 time slower than normal summation (a bit can be > regained via vectorization thought) > > My idea is have an option to change the algorithms used in numpy > depending on the set requirements. > E.g. > > with np.precmode(default="fast"): > np.abs(complex_array) > > or fast everything except sum and hypot > > with np.precmode(default="fast", sum="kahan", hypot="standard"): > np.sum(d) > > I have not though much about implementation, it might be tricky to get > this threadsafe in the current ufunc model. > > 2. vector math library integration > > some operations like powers, sin, cos etc are relatively slow in numpy > depending on the c library used. There are now a few free libraries > available that make use of modern hardware to speed these operations up, > e.g. sleef and yeppp (also mkl but I have no interest in supporting > non-free software) > It might be interesting to investigate if these libraries can be > integrated with numpy. > This also somewhat ties in with the configurable precision mode as the > vector math libraries often have different options depending on > precision and speed requirements. > > Do those sound like topics we could add to our wiki? > To me (2) sounds potentially very interesting, and definitely enough for a GSoC. Would need a very talented student to take this on. (1) I'm not sure about, somehow just doesn't sound like it would have a big impact. Also maybe not enough work to fill 3+ months with? Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From ben.root at ou.edu Mon Mar 3 17:02:28 2014 From: ben.root at ou.edu (Benjamin Root) Date: Mon, 3 Mar 2014 17:02:28 -0500 Subject: [Numpy-discussion] dtype promotion In-Reply-To: <0F46B6BC-7D49-4591-AFFE-6E724CE95219@inria.fr> References: <0F46B6BC-7D49-4591-AFFE-6E724CE95219@inria.fr> Message-ID: IIRC, this is dependent on whether you are using 32bit versus 64bit numpy. All regular integer numbers can fit in 32 bits (is that right?), but the 1.1 is treated as a float32 if on a 32 bit NumPy or as float64 if on a 64 bit NumPy. That's my stab at it. Ben Root On Mon, Mar 3, 2014 at 4:06 PM, Nicolas Rougier wrote: > > Hi all, > > I'm using numpy 1.8.0 (osx 10.9, python 2.7.6) and I can't understand > dtype promotion in the following case: > > >>> Z = np.zeros((2,2),dtype=np.float32) + 1 > >>> print Z.dtype > float32 > > >>> Z = np.zeros((2,2),dtype=np.float32) + (1,1) > >>> print Z.dtype > float64 > > > Is this the expected behavior ? > What it the difference between the two lines ? > > > > Nicolas > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Nicolas.Rougier at inria.fr Mon Mar 3 17:12:59 2014 From: Nicolas.Rougier at inria.fr (Nicolas Rougier) Date: Mon, 3 Mar 2014 23:12:59 +0100 Subject: [Numpy-discussion] dtype promotion In-Reply-To: References: <0F46B6BC-7D49-4591-AFFE-6E724CE95219@inria.fr> Message-ID: I never noticed this kind of cast before (1.8.0), it's just a bit surprising. It was convenient to write translations (for a bunch of points) such as: Z = np.ones((n,2),dtype=np.float32) + (300,300) but I can live with Z += 300,300 Nicolas On 03 Mar 2014, at 23:02, Benjamin Root wrote: > IIRC, this is dependent on whether you are using 32bit versus 64bit numpy. All regular integer numbers can fit in 32 bits (is that right?), but the 1.1 is treated as a float32 if on a 32 bit NumPy or as float64 if on a 64 bit NumPy. > > That's my stab at it. > > Ben Root > > > On Mon, Mar 3, 2014 at 4:06 PM, Nicolas Rougier wrote: > > Hi all, > > I'm using numpy 1.8.0 (osx 10.9, python 2.7.6) and I can't understand dtype promotion in the following case: > > >>> Z = np.zeros((2,2),dtype=np.float32) + 1 > >>> print Z.dtype > float32 > > >>> Z = np.zeros((2,2),dtype=np.float32) + (1,1) > >>> print Z.dtype > float64 > > > Is this the expected behavior ? > What it the difference between the two lines ? > > > > Nicolas > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From ben.root at ou.edu Mon Mar 3 17:17:18 2014 From: ben.root at ou.edu (Benjamin Root) Date: Mon, 3 Mar 2014 17:17:18 -0500 Subject: [Numpy-discussion] dtype promotion In-Reply-To: References: <0F46B6BC-7D49-4591-AFFE-6E724CE95219@inria.fr> Message-ID: Oops, I just now noticed that it was (1,1) and not (1.1). I really need to set a better font that makes the period and the comma more different... Ben Root On Mon, Mar 3, 2014 at 5:12 PM, Nicolas Rougier wrote: > > > I never noticed this kind of cast before (1.8.0), it's just a bit > surprising. > > It was convenient to write translations (for a bunch of points) such as: > > Z = np.ones((n,2),dtype=np.float32) + (300,300) > > but I can live with Z += 300,300 > > > Nicolas > > > On 03 Mar 2014, at 23:02, Benjamin Root wrote: > > > IIRC, this is dependent on whether you are using 32bit versus 64bit > numpy. All regular integer numbers can fit in 32 bits (is that right?), but > the 1.1 is treated as a float32 if on a 32 bit NumPy or as float64 if on a > 64 bit NumPy. > > > > That's my stab at it. > > > > Ben Root > > > > > > On Mon, Mar 3, 2014 at 4:06 PM, Nicolas Rougier < > Nicolas.Rougier at inria.fr> wrote: > > > > Hi all, > > > > I'm using numpy 1.8.0 (osx 10.9, python 2.7.6) and I can't understand > dtype promotion in the following case: > > > > >>> Z = np.zeros((2,2),dtype=np.float32) + 1 > > >>> print Z.dtype > > float32 > > > > >>> Z = np.zeros((2,2),dtype=np.float32) + (1,1) > > >>> print Z.dtype > > float64 > > > > > > Is this the expected behavior ? > > What it the difference between the two lines ? > > > > > > > > Nicolas > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Mon Mar 3 17:32:05 2014 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Mon, 03 Mar 2014 23:32:05 +0100 Subject: [Numpy-discussion] dtype promotion In-Reply-To: References: <0F46B6BC-7D49-4591-AFFE-6E724CE95219@inria.fr> Message-ID: <1393885925.2823.12.camel@sebastian-laptop> On Mon, 2014-03-03 at 23:12 +0100, Nicolas Rougier wrote: > > I never noticed this kind of cast before (1.8.0), it's just a bit surprising. > > It was convenient to write translations (for a bunch of points) such as: > > Z = np.ones((n,2),dtype=np.float32) + (300,300) > > but I can live with Z += 300,300 > Just to note. That actually does the temporary cast anyway doing the calculation in double precision and then casting the result. If you want to make sure it stays in single precision you will need to make that an array with float32 dtype. - Sebastian > > Nicolas > > > On 03 Mar 2014, at 23:02, Benjamin Root wrote: > > > IIRC, this is dependent on whether you are using 32bit versus 64bit numpy. All regular integer numbers can fit in 32 bits (is that right?), but the 1.1 is treated as a float32 if on a 32 bit NumPy or as float64 if on a 64 bit NumPy. > > > > That's my stab at it. > > > > Ben Root > > > > > > On Mon, Mar 3, 2014 at 4:06 PM, Nicolas Rougier wrote: > > > > Hi all, > > > > I'm using numpy 1.8.0 (osx 10.9, python 2.7.6) and I can't understand dtype promotion in the following case: > > > > >>> Z = np.zeros((2,2),dtype=np.float32) + 1 > > >>> print Z.dtype > > float32 > > > > >>> Z = np.zeros((2,2),dtype=np.float32) + (1,1) > > >>> print Z.dtype > > float64 > > > > > > Is this the expected behavior ? > > What it the difference between the two lines ? > > > > > > > > Nicolas > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From sturla.molden at gmail.com Mon Mar 3 22:17:06 2014 From: sturla.molden at gmail.com (Sturla Molden) Date: Tue, 04 Mar 2014 04:17:06 +0100 Subject: [Numpy-discussion] How security holes happen In-Reply-To: References: Message-ID: On 03/03/14 03:15, Charles R Harris wrote: > This is from OS X 9 > > if ((err = SSLHashSHA1.update(&hashCtx, &serverRandom)) != 0) > goto fail; > if ((err = SSLHashSHA1.update(&hashCtx, &signedParams)) != 0) > goto fail; > goto fail; > if ((err = SSLHashSHA1.final(&hashCtx, &hashOut)) != 0) > goto fail; > > Heh, maybe there is a reason for braces in even the simplest if statements. It is quite evident in an editor with syntax highlighting. This is almost too good to be a coincidental coding error. If there ever were a deliberate backdoor attempt in an OS, it would be something like this. At least Apple shows us their Darwin code. Nobody get to scrutinize Microsoft's Windows code in public. I also amazed that the bugfix was a 500 MB download. Sturla -------------- next part -------------- A non-text attachment was scrubbed... Name: apple-goto-bug.png Type: image/png Size: 189574 bytes Desc: not available URL: From sturla.molden at gmail.com Mon Mar 3 22:32:11 2014 From: sturla.molden at gmail.com (Sturla Molden) Date: Tue, 04 Mar 2014 04:32:11 +0100 Subject: [Numpy-discussion] How security holes happen In-Reply-To: References: Message-ID: Gotos are indeed useful... Sturla -------------- next part -------------- A non-text attachment was scrubbed... Name: Screen Shot 2014-03-04 at 04.23.12.png Type: image/png Size: 81742 bytes Desc: not available URL: From sturla.molden at gmail.com Mon Mar 3 22:34:07 2014 From: sturla.molden at gmail.com (Sturla Molden) Date: Tue, 04 Mar 2014 04:34:07 +0100 Subject: [Numpy-discussion] How security holes happen In-Reply-To: References: Message-ID: Would you hire a programmer that worries about this? Sturla -------------- next part -------------- A non-text attachment was scrubbed... Name: Screen Shot 2014-03-04 at 04.25.51.png Type: image/png Size: 71380 bytes Desc: not available URL: From thomas_unterthiner at web.de Tue Mar 4 07:49:29 2014 From: thomas_unterthiner at web.de (Thomas Unterthiner) Date: Tue, 04 Mar 2014 13:49:29 +0100 Subject: [Numpy-discussion] 1.8.1rc1 on sourceforge. In-Reply-To: References: Message-ID: <5315CBD9.5040201@web.de> Hi there! I just tried setting up a new installation using numpy 1.8.1rc1 (+scipy 0.13.3 and matplotlib 1.3.1). I ran into problems when installing matplotlib 1.3.1. The attached logfile shows the full log, but it ends with: src/_png.cpp:329:15: error: 'npy_PyFile_Dup' was not declared in this scope if ((fp = npy_PyFile_Dup(py_file, "rb"))) ^ src/_png.cpp:577:13: error: 'npy_PyFile_DupClose' was not declared in this scope if (npy_PyFile_DupClose(py_file, fp)) { ^ error: command 'x86_64-linux-gnu-gcc' failed with exit status 1 The problem went away (and matplotlib installed cleanly) when I re-did the whole shebang using numpy 1.8.0, so I suspect this was caused by something in the rc. Cheers Thomas On 2014-03-03 17:23, Charles R Harris wrote: > Hi All, > > Julian Taylor has put windows binaries and sources for the 1.8.1 > release candidate up on sourceforge > . If > things go well, it will taken to a full release in a week or so. > > Chuck > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- $ OPT='-march=native' python setup.py install ============================================================================ Edit setup.cfg to change the build options BUILDING MATPLOTLIB matplotlib: yes [1.3.1] python: yes [2.7.5+ (default, Feb 27 2014, 19:37:08) [GCC 4.8.1]] platform: yes [linux2] REQUIRED DEPENDENCIES AND EXTENSIONS numpy: yes [version 1.8.1rc1] dateutil: yes [using dateutil version 1.5] tornado: yes [using tornado version 2.4.1] pyparsing: yes [using pyparsing version 2.0.1] pycxx: yes [Couldn't import. Using local copy.] libagg: yes [pkg-config information for 'libagg' could not be found. Using local copy.] freetype: yes [version 16.1.10] png: yes [version 1.2.49] OPTIONAL SUBPACKAGES sample_data: yes [installing] toolkits: yes [installing] tests: yes [using nose version 1.3.0] OPTIONAL BACKEND EXTENSIONS macosx: no [Mac OS-X only] qt4agg: yes [installing, Qt: 4.8.4, PyQt4: 4.10.3] gtk3agg: yes [installing, version 3.6.8] gtk3cairo: yes [installing, version 3.6.8] gtkagg: yes [installing, Gtk: 2.24.20 pygtk: 2.24.0] tkagg: no [The C/C++ header for Tk (tk.h) could not be found. You may need to install the development package.] wxagg: yes [installing, version 2.8.12.1] gtk: yes [installing, Gtk: 2.24.20 pygtk: 2.24.0] agg: yes [installing] cairo: yes [installing, version 1.8.8] windowing: no [Microsoft Windows only] OPTIONAL LATEX DEPENDENCIES dvipng: yes [version 1.14] ghostscript: yes [version 9.10] latex: yes [version 3.1415926] pdftops: yes [version 0.24.1] running install Checking .pth file support in /usr/local/lib/python2.7/dist-packages/ /usr/bin/python -E -c pass TEST PASSED: /usr/local/lib/python2.7/dist-packages/ appears to support .pth files running bdist_egg running egg_info writing requirements to lib/matplotlib.egg-info/requires.txt writing lib/matplotlib.egg-info/PKG-INFO writing namespace_packages to lib/matplotlib.egg-info/namespace_packages.txt writing top-level names to lib/matplotlib.egg-info/top_level.txt writing dependency_links to lib/matplotlib.egg-info/dependency_links.txt reading manifest file 'lib/matplotlib.egg-info/SOURCES.txt' reading manifest template 'MANIFEST.in' writing manifest file 'lib/matplotlib.egg-info/SOURCES.txt' installing library code to build/bdist.linux-x86_64/egg running install_lib running build_py copying lib/matplotlib/mpl-data/matplotlibrc -> build/lib.linux-x86_64-2.7/matplotlib/mpl-data running build_ext building 'matplotlib._png' extension x86_64-linux-gnu-gcc -pthread -fno-strict-aliasing -march=native -fPIC -DPY_ARRAY_UNIQUE_SYMBOL=MPL_matplotlib__png_ARRAY_API -DPYCXX_ISO_CPP_LIB=1 -I/usr/local/lib/python2.7/dist-packages/numpy/core/include -I/usr/local/include -I/usr/include -I. -I/usr/include/libpng12 -I/usr/include/python2.7 -c src/_png.cpp -o build/temp.linux-x86_64-2.7/src/_png.o In file included from /usr/local/lib/python2.7/dist-packages/numpy/core/include/numpy/ndarraytypes.h:1761:0, from /usr/local/lib/python2.7/dist-packages/numpy/core/include/numpy/ndarrayobject.h:17, from /usr/local/lib/python2.7/dist-packages/numpy/core/include/numpy/arrayobject.h:4, from src/_png.cpp:28: /usr/local/lib/python2.7/dist-packages/numpy/core/include/numpy/npy_1_7_deprecated_api.h:15:2: warning: #warning "Using deprecated NumPy API, disable it by " "#defining NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION" [-Wcpp] #warning "Using deprecated NumPy API, disable it by " \ ^ src/_png.cpp:147:51: error: macro "npy_PyFile_Dup" requires 3 arguments, but only 2 given if ((fp = npy_PyFile_Dup(py_file, (char *)"wb"))) ^ src/_png.cpp:243:48: error: macro "npy_PyFile_DupClose" requires 3 arguments, but only 2 given if (npy_PyFile_DupClose(py_file, fp)) { ^ src/_png.cpp:264:44: error: macro "npy_PyFile_DupClose" requires 3 arguments, but only 2 given if (npy_PyFile_DupClose(py_file, fp)) { ^ src/_png.cpp:329:43: error: macro "npy_PyFile_Dup" requires 3 arguments, but only 2 given if ((fp = npy_PyFile_Dup(py_file, "rb"))) ^ src/_png.cpp:577:44: error: macro "npy_PyFile_DupClose" requires 3 arguments, but only 2 given if (npy_PyFile_DupClose(py_file, fp)) { ^ In file included from src/file_compat.h:4:0, from src/_png.cpp:31: /usr/local/lib/python2.7/dist-packages/numpy/core/include/numpy/npy_3kcompat.h: In function ?PyObject* npy_PyFile_OpenFile(PyObject*, const char*)?: /usr/local/lib/python2.7/dist-packages/numpy/core/include/numpy/npy_3kcompat.h:288:60: warning: deprecated conversion from string constant to ?char*? [-Wwrite-strings] return PyObject_CallFunction(open, "Os", filename, mode); ^ /usr/local/lib/python2.7/dist-packages/numpy/core/include/numpy/npy_3kcompat.h: In function ?int npy_PyFile_CloseFile(PyObject*)?: /usr/local/lib/python2.7/dist-packages/numpy/core/include/numpy/npy_3kcompat.h:296:50: warning: deprecated conversion from string constant to ?char*? [-Wwrite-strings] ret = PyObject_CallMethod(file, "close", NULL); ^ src/_png.cpp: In member function ?Py::Object _png_module::write_png(const Py::Tuple&)?: src/_png.cpp:147:15: error: ?npy_PyFile_Dup? was not declared in this scope if ((fp = npy_PyFile_Dup(py_file, (char *)"wb"))) ^ src/_png.cpp:243:17: error: ?npy_PyFile_DupClose? was not declared in this scope if (npy_PyFile_DupClose(py_file, fp)) { ^ src/_png.cpp:264:13: error: ?npy_PyFile_DupClose? was not declared in this scope if (npy_PyFile_DupClose(py_file, fp)) { ^ src/_png.cpp: In member function ?PyObject* _png_module::_read_png(const Py::Object&, bool, int)?: src/_png.cpp:329:15: error: ?npy_PyFile_Dup? was not declared in this scope if ((fp = npy_PyFile_Dup(py_file, "rb"))) ^ src/_png.cpp:577:13: error: ?npy_PyFile_DupClose? was not declared in this scope if (npy_PyFile_DupClose(py_file, fp)) { ^ error: command 'x86_64-linux-gnu-gcc' failed with exit status 1 From sturla.molden at gmail.com Tue Mar 4 09:47:50 2014 From: sturla.molden at gmail.com (Sturla Molden) Date: Tue, 4 Mar 2014 14:47:50 +0000 (UTC) Subject: [Numpy-discussion] ndarray is not a sequence References: <780064138415269619.294722sturla.molden-gmail.com@news.gmane.org> <1393578222.6392.2.camel@sebastian-t440> <-5567695652207982944@unknownmsgid> Message-ID: <1868000954415636263.145116sturla.molden-gmail.com@news.gmane.org> Chris Barker wrote: > Sure -- but I'm afraid that there will be a lot of code that does an > isinstance() check where it it absolutely unnecessary. If you really need > to know if something is a sequence or a mapping, I suppose it's required, > but how often is that? I must say I don't understand the purpose of Java-like interfaces in Python. It is better to ask forgiveness than ask permission. If an object does not have the required attributes we sooner or later get an AttributeError. Is doing an isinstance() check and then raising TypeError inherently better? In both cases we get an error that unit tests should detect. I might do sanity checks on the input data if it is required to ensure correct output. But I never do this to ensure that objects have all the attributes they should. If they don't I prefer my program to die with an unhandled AttributeError. Sturla From cgohlke at uci.edu Tue Mar 4 12:08:36 2014 From: cgohlke at uci.edu (Christoph Gohlke) Date: Tue, 04 Mar 2014 09:08:36 -0800 Subject: [Numpy-discussion] 1.8.1rc1 on sourceforge. In-Reply-To: <5315CBD9.5040201@web.de> References: <5315CBD9.5040201@web.de> Message-ID: <53160894.7070207@uci.edu> On 3/4/2014 4:49 AM, Thomas Unterthiner wrote: > Hi there! > > I just tried setting up a new installation using numpy 1.8.1rc1 (+scipy > 0.13.3 and matplotlib 1.3.1). I ran into problems when installing > matplotlib 1.3.1. The attached logfile shows the full log, but it ends with: > > src/_png.cpp:329:15: error: ?npy_PyFile_Dup? was not declared in this scope > if ((fp = npy_PyFile_Dup(py_file, "rb"))) > ^ > src/_png.cpp:577:13: error: ?npy_PyFile_DupClose? was not declared in > this scope > if (npy_PyFile_DupClose(py_file, fp)) { > ^ > error: command 'x86_64-linux-gnu-gcc' failed with exit status 1 > > > The problem went away (and matplotlib installed cleanly) when I re-did > the whole shebang using numpy 1.8.0, so I suspect this was caused by > something in the rc. > > Cheers > > Thomas > > This error is known and expected. It is due to an API change in a semi-private numpy header and it is fixed in matplotlib master and v1.3.x. Christoph From jtaylor.debian at googlemail.com Tue Mar 4 12:29:37 2014 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Tue, 04 Mar 2014 18:29:37 +0100 Subject: [Numpy-discussion] 1.8.1rc1 on sourceforge. In-Reply-To: <53160894.7070207@uci.edu> References: <5315CBD9.5040201@web.de> <53160894.7070207@uci.edu> Message-ID: <53160D81.3030804@googlemail.com> On 04.03.2014 18:08, Christoph Gohlke wrote: > On 3/4/2014 4:49 AM, Thomas Unterthiner wrote: >> Hi there! >> >> I just tried setting up a new installation using numpy 1.8.1rc1 (+scipy >> 0.13.3 and matplotlib 1.3.1). I ran into problems when installing >> matplotlib 1.3.1. The attached logfile shows the full log, but it ends with: >> >> src/_png.cpp:329:15: error: ?npy_PyFile_Dup? was not declared in this scope >> if ((fp = npy_PyFile_Dup(py_file, "rb"))) >> ^ >> src/_png.cpp:577:13: error: ?npy_PyFile_DupClose? was not declared in >> this scope >> if (npy_PyFile_DupClose(py_file, fp)) { >> ^ >> error: command 'x86_64-linux-gnu-gcc' failed with exit status 1 >> >> >> The problem went away (and matplotlib installed cleanly) when I re-did >> the whole shebang using numpy 1.8.0, so I suspect this was caused by >> something in the rc. >> >> Cheers >> >> Thomas >> >> > > > This error is known and expected. It is due to an API change in a > semi-private numpy header and it is fixed in matplotlib master and v1.3.x. > > > > > hm breaking released matplotlib is bad, I though matplotlib didn't use that function, I could have sworn I checked that. I guess we will have to revert this change to an internal duplicate. From cmkleffner at gmail.com Wed Mar 5 03:57:13 2014 From: cmkleffner at gmail.com (Carl Kleffner) Date: Wed, 5 Mar 2014 09:57:13 +0100 Subject: [Numpy-discussion] numpy gsoc topic idea: configurable algorithm precision and vector math library integration In-Reply-To: References: <5314D5FF.2070104@googlemail.com> Message-ID: I want to point out, that Intel provides a very interesting OSS compiler based on LLVM that targets vectorized code for SSE, AVX instructions on x86 and x86_64 platforms. Carl https://github.com/ispc/ispc/ http://ispc.github.io/ quote: ispc is a compiler for a variant of the C programming language, with extensions for "single program, multiple data" (SPMD) programming. Under the SPMD model, the programmer writes a program that generally appears to be a regular serial program, though the execution model is actually that a number of program instances execute in parallel on the hardware. (See the ispc documentation for more details and examples that illustrate this concept.) ... ispc is an open source compiler with a BSD license. It uses the remarkable LLVM Compiler Infrastructure for back-end code generation and optimization and is hosted on github. It supports Windows, Mac, and Linux, with both x86 and x86-64 targets. It currently supports the SSE2, SSE4, AVX1, AVX2, and Xeon Phi "Knight's Corner" instruction sets. 2014-03-03 22:51 GMT+01:00 Ralf Gommers : > > > > On Mon, Mar 3, 2014 at 8:20 PM, Julian Taylor < > jtaylor.debian at googlemail.com> wrote: > >> hi, >> >> as the numpy gsoc topic page is a little short on options I was thinking >> about adding two topics for interested students. But as I have no >> experience with gsoc or mentoring and the ideas are not very fleshed out >> yet I'd like to ask if it might make sense at all: >> >> 1. configurable algorithm precision >> >> some functions in numpy could be implemented in different ways depending >> on requirements on speed and numerical precision. >> Two examples that come to my mind are hypot and sum >> >> hypot (or abs(complex)) use the C99 hypot function which guarantees 1ulp >> precision but is very slow compared to a simple sqrt(a**2 +b**2). >> This precision might not be required for all applications, overflow >> safety might be enough. >> >> summation in numpy 1.9 is performed via pairwise summation which has >> O(log(n)*e) error properties, but only for the fast axis. >> An alternative O(e) approach would be kahan summation which works for >> all axis but is 4 time slower than normal summation (a bit can be >> regained via vectorization thought) >> >> My idea is have an option to change the algorithms used in numpy >> depending on the set requirements. >> E.g. >> >> with np.precmode(default="fast"): >> np.abs(complex_array) >> >> or fast everything except sum and hypot >> >> with np.precmode(default="fast", sum="kahan", hypot="standard"): >> np.sum(d) >> >> I have not though much about implementation, it might be tricky to get >> this threadsafe in the current ufunc model. >> >> 2. vector math library integration >> >> some operations like powers, sin, cos etc are relatively slow in numpy >> depending on the c library used. There are now a few free libraries >> available that make use of modern hardware to speed these operations up, >> e.g. sleef and yeppp (also mkl but I have no interest in supporting >> non-free software) >> It might be interesting to investigate if these libraries can be >> integrated with numpy. >> This also somewhat ties in with the configurable precision mode as the >> vector math libraries often have different options depending on >> precision and speed requirements. >> >> Do those sound like topics we could add to our wiki? >> > > To me (2) sounds potentially very interesting, and definitely enough for a > GSoC. Would need a very talented student to take this on. > > (1) I'm not sure about, somehow just doesn't sound like it would have a > big impact. Also maybe not enough work to fill 3+ months with? > > Ralf > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sudheer.iiita at gmail.com Wed Mar 5 05:02:36 2014 From: sudheer.iiita at gmail.com (Sudheer Singh) Date: Wed, 5 Mar 2014 15:32:36 +0530 Subject: [Numpy-discussion] Implementing Levenberg-Marquardt with additional feature Message-ID: Hello Everyone !! I am Sudheer singh , an information technology student at IIIT - ALLAHABAD. I'm interested in contributing to Numpy.I was going through Idea page and I found Implementing "Levenberg-Marquardt " with additional feature like inequality constraints and sparse Jacobian matrix. -------------- next part -------------- An HTML attachment was scrubbed... URL: From sturla.molden at gmail.com Wed Mar 5 11:25:09 2014 From: sturla.molden at gmail.com (Sturla Molden) Date: Wed, 5 Mar 2014 16:25:09 +0000 (UTC) Subject: [Numpy-discussion] Implementing Levenberg-Marquardt with additional feature References: Message-ID: <709789436415727166.453137sturla.molden-gmail.com@news.gmane.org> Sudheer Singh wrote: > Hello Everyone !! I am Sudheer singh , an information technology student > at IIIT - ALLAHABAD. I'm interested in contributing to Numpy.I was going > through Idea page and I found Implementing "Levenberg-Marquardt " with > additional feature like inequality constraints and sparse Jacobian matrix. You should rather post this on the SciPy lists :) Sturla From sebastian at sipsolutions.net Wed Mar 5 11:45:47 2014 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Wed, 05 Mar 2014 17:45:47 +0100 Subject: [Numpy-discussion] Adding weights to cov and corrcoef Message-ID: <1394037947.21356.20.camel@sebastian-t440> Hi all, in Pull Request https://github.com/numpy/numpy/pull/3864 Neol Dawe suggested adding new parameters to our `cov` and `corrcoef` functions to implement weights, which already exists for `average` (the PR still needs to be adapted). The idea right now would be to add a `weights` and a `frequencies` keyword arguments to these functions. In more detail: The situation is a bit more complex for `cov` and `corrcoef` than `average`, because there are different types of weights. The current plan would be to add two new keyword arguments: * weights: Uncertainty weights which causes `N` to be recalculated accordingly (This is R's `cov.wt` default I believe). * frequencies: When given, `N = sum(frequencies)` and the values are weighted by their frequency. Because it appeared that the uncertainty type of weights are not obvious, while other types of weights should be pretty easily implemented by scaling `frequencies` (i.e. one may want `sum(frequencies) == len(data)`). However, we may have missed something obvious, or maybe it is already getting too statistical for NumPy, or the keyword argument might be better `uncertainties` and `frequencies`. So comments and insights are very welcome :). Regards, Sebastian From lmao20001 at gmail.com Wed Mar 5 11:52:42 2014 From: lmao20001 at gmail.com (Leo Mao) Date: Thu, 6 Mar 2014 00:52:42 +0800 Subject: [Numpy-discussion] GSoC 2014 NumPy In-Reply-To: References: Message-ID: On Tue, Mar 4, 2014 at 5:42 AM, Ralf Gommers wrote: > > It's possible to come up with an interesting proposal in this area I > think. An issue may be that the FFT code in numpy and scipy isn't very > actively worked on at the moment, so finding a suitable mentor could be > tricky. > So should I choose another topic? Actually, I just read the thread "numpy gsoc topic idea" and I found that the idea "vector math library integration" really interests me! And I have a qeustion: how can I find a suitable mentor? Currently I'm digging into the source of numpy and trying to make a small pull request. Also I keep thinking how to write a proper proposal. Are these enough? Can I do more for now? I will be grateful for any advice. Thanks in advance. Regards, Leo Mao -------------- next part -------------- An HTML attachment was scrubbed... URL: From d.l.goldsmith at gmail.com Wed Mar 5 13:21:57 2014 From: d.l.goldsmith at gmail.com (David Goldsmith) Date: Wed, 5 Mar 2014 10:21:57 -0800 Subject: [Numpy-discussion] Adding weights to cov and corrcoef (Sebastian Berg) Message-ID: Date: Wed, 05 Mar 2014 17:45:47 +0100 > From: Sebastian Berg > Subject: [Numpy-discussion] Adding weights to cov and corrcoef > To: numpy-discussion at scipy.org > Message-ID: <1394037947.21356.20.camel at sebastian-t440> > Content-Type: text/plain; charset="UTF-8" > > Hi all, > > in Pull Request https://github.com/numpy/numpy/pull/3864 Neol Dawe > suggested adding new parameters to our `cov` and `corrcoef` functions to > implement weights, which already exists for `average` (the PR still > needs to be adapted). > Do you mean adopted? > However, we may have missed something obvious, or maybe it is already > getting too statistical for NumPy, or the keyword argument might be > better `uncertainties` and `frequencies`. So comments and insights are > very welcome :). > +1 for it being "too baroque" for NumPy--should go in SciPy (if it isn't already there): IMHO, NumPy should be kept as "lean and mean" as possible, embellishments are what SciPy is for. (Again, IMO.) DG -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Wed Mar 5 14:37:06 2014 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Wed, 5 Mar 2014 20:37:06 +0100 Subject: [Numpy-discussion] EuroSciPy 2014 Call for Abstracts Message-ID: Dear all, EuroSciPy 2014, the Seventh Annual Conference on Python in Science, takes place in Cambridge, UK on 27 - 30 August 2013. The conference features two days of tutorials followed by two days of scientific talks. The day after the main conference, developer sprints will be organized on projects of interest to attendees. The topics presented at EuroSciPy are very diverse, with a focus on advanced software engineering and original uses of Python and its scientific libraries, either in theoretical or experimental research, from both academia and the industry. The program includes keynotes, contributed talks and posters. Submissions for talks and posters are welcome on our website (http://www. euroscipy.org/2014/). In your abstract, please provide details on what Python tools are being employed, and how. The deadline for submission is 14 April 2013. Also until 14 April 2014, you can apply for a sprint session on 31 August 2014. See https://www.euroscipy.org/2014/calls/sprints/ for details. Important dates: April 14th: Presentation abstracts, poster, tutorial submission deadline. Application for sponsorship deadline. May 17th: Speakers selected May 22nd: Sponsorship acceptance deadline June 1st: Speaker schedule announced June 6th, or 150 registrants: Early-bird registration ends August 27-31st: 2 days of tutorials, 2 days of conference, 1 day of sprints We look forward to an exciting conference and hope to see you in Cambridge in August! The EuroSciPy 2014 Team http://www.euroscipy.org/2014/ Conference Chairs -------------------------- Mark Hayes, Cambridge University, UK Didrik Pinte, Enthought Europe, UK Tutorial Chair ------------------- David Cournapeau, Enthought Europe, UK Program Chair -------------------- Ralf Gommers, ASML, The Netherlands Program Committee ----------------------------- Tiziano Zito, Humboldt-Universit?t zu Berlin, Germany Pierre de Buyl, Universit? libre de Bruxelles, Belgium Emmanuelle Gouillart, Joint Unit CNRS/Saint-Gobain, France Konrad Hinsen, Centre National de la Recherche Scientifique (CNRS), France Raphael Ritz, Garching Computing Centre of the Max Planck Society, Germany St?fan van der Walt, Applied Mathematics, Stellenbosch University, South Africa Ga?l Varoquaux, INRIA Parietal, Saclay, France Nelle Varoquaux, Mines ParisTech, France Pauli Virtanen, Aalto University, Finland Evgeni Burovski, Lancaster University, UK Robert Cimrman, New Technologies Research Centre, University of West Bohemia, Czech Republic Almar Klein, Cybermind, The Netherlands Organizing Committee ------------------------------ Simon Jagoe, Enthought Europe, UK Pierre de Buyl, Universit? libre de Bruxelles, Belgium -------------- next part -------------- An HTML attachment was scrubbed... URL: From marquett at iap.fr Wed Mar 5 14:39:51 2014 From: marquett at iap.fr (Jean-Baptiste Marquette) Date: Wed, 5 Mar 2014 20:39:51 +0100 Subject: [Numpy-discussion] EuroSciPy 2014 Call for Abstracts In-Reply-To: References: Message-ID: Hi Ralf, > EuroSciPy 2014, the Seventh Annual Conference on Python in Science, takes place in Cambridge, UK on 27 - 30 August 2013. The conference features two days of tutorials followed by two days of scientific talks. The day after the main conference, developer sprints will be organized on projects of interest to attendees. > > The topics presented at EuroSciPy are very diverse, with a focus on advanced software engineering and original uses of Python and its scientific libraries, either in theoretical or experimental research, from both academia and the industry. The program includes keynotes, contributed talks and posters. > > Submissions for talks and posters are welcome on our website (http://www.euroscipy.org/2014/). In your abstract, please provide details on what Python tools are being employed, and how. The deadline for submission is 14 April 2013. Some dates refer to 2013? Cheers, JB -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Wed Mar 5 14:43:22 2014 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Wed, 5 Mar 2014 20:43:22 +0100 Subject: [Numpy-discussion] EuroSciPy 2014 Call for Abstracts In-Reply-To: References: Message-ID: On Wed, Mar 5, 2014 at 8:39 PM, Jean-Baptiste Marquette wrote: > Hi Ralf, > > EuroSciPy 2014, the Seventh Annual Conference on Python in Science, takes > place in Cambridge, UK on 27 - 30 August 2013. The conference features two > days of tutorials followed by two days of scientific talks. The day after > the main conference, developer sprints will be organized on projects of > interest to attendees. > > The topics presented at EuroSciPy are very diverse, with a focus on > advanced software engineering and original uses of Python and its > scientific libraries, either in theoretical or experimental research, from > both academia and the industry. The program includes keynotes, contributed > talks and posters. > > Submissions for talks and posters are welcome on our website (http://www. > euroscipy.org/2014/). In your abstract, please provide details on what > Python tools are being employed, and how. The deadline for submission is 14 > April 2013. > > > Some dates refer to 2013... > Hmm, that's why one shouldn't send emails like these at the end of a long day..... Dates are correct except for 2013-->2014. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From marquett at iap.fr Wed Mar 5 14:50:18 2014 From: marquett at iap.fr (Jean-Baptiste Marquette) Date: Wed, 5 Mar 2014 20:50:18 +0100 Subject: [Numpy-discussion] EuroSciPy 2014 Call for Abstracts In-Reply-To: References: Message-ID: <2CD5FBCC-EC1E-4D23-8620-9804A347D483@iap.fr> Le 5 mars 2014 ? 20:43, Ralf Gommers a ?crit : > Hmm, that's why one shouldn't send emails like these at the end of a long day..... Dates are correct except for 2013-->2014. ? It's only those who do nothing that make no mistakes, I suppose. ? Joseph Conrad (An Outcast of the Islands, 1896, pt. 3, chap. 2) -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Wed Mar 5 15:21:13 2014 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Wed, 5 Mar 2014 21:21:13 +0100 Subject: [Numpy-discussion] GSoC 2014 NumPy In-Reply-To: References: Message-ID: On Wed, Mar 5, 2014 at 5:52 PM, Leo Mao wrote: > On Tue, Mar 4, 2014 at 5:42 AM, Ralf Gommers wrote: > >> >> It's possible to come up with an interesting proposal in this area I >> think. An issue may be that the FFT code in numpy and scipy isn't very >> actively worked on at the moment, so finding a suitable mentor could be >> tricky. >> > > So should I choose another topic? > I suspect that that may be better. > Actually, I just read the thread "numpy gsoc topic idea" and I found that > the idea "vector math library integration" really interests me! > If it's interesting, I suggest to dive in. It's Julian's idea, so probably he's interested to (co-)mentor a project on this topic. Julian? > And I have a qeustion: how can I find a suitable mentor? > That's a nontrivial question. Let me send a separate email to the list about that. > Currently I'm digging into the source of numpy and trying to make a small > pull request. Also I keep thinking how to write a proper proposal. > Are these enough? Can I do more for now? > Keep in mind that your proposal will need to go through a few rounds of feedback and rework before it's solid enough that you can submit it. The same usually goes for pull requests. Cheers, Ralf > I will be grateful for any advice. > Thanks in advance. > > Regards, > Leo Mao > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Wed Mar 5 15:30:36 2014 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Wed, 5 Mar 2014 21:30:36 +0100 Subject: [Numpy-discussion] GSoC: ideas & finding mentors Message-ID: Hi students, There is quite a bit of interest in GSoC ideas for Scipy and Numpy, which is great to see. The official application period to submit proposals opens next week and closes on the 21st, which is in two weeks and a bit. So now is the time to start discussing draft proposals on the list. There have been a few ideas posted on the list which haven't gotten enough feedback yet (FFTs, cluster, ODEs). This may reflect the lack of an active maintainer of those modules, so it will be harder to find a suitable mentor. I want to point out that this is also a chicken-and-egg problem: if you're actively posting and improving your draft and sending some pull requests to fix some small issues, it shows both your willingness to work with the community and how you work with core devs to get your PRs merged, which helps find an interested mentor. To tackle the student-mentor matchmaking from another angle, I've added on https://github.com/scipy/scipy/wiki/GSoC-project-ideas a "potential mentors" field to the idea I know the names for (Stefan and me, for wavelets). If other potential mentors could do the same for other ideas, that would be very helpful. I can take some guesses (Pauli, Evgeni for splines? Chuck, Chris Barker for datetime?) but I haven't added any names. So please do add your name, keeping in mind that this is to get the process going and not yet a full commitment. Final note: you don't necessarily have to be a core developer to be a co-mentor. If you're an expert on a topic that a student is interested in and would like to see that project happen, please indicate you're willing to help. Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Wed Mar 5 16:11:40 2014 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 5 Mar 2014 21:11:40 +0000 Subject: [Numpy-discussion] numpy gsoc ideas (was: numpy gsoc topic idea: configurable algorithm precision and vector math library integration) Message-ID: On Mon, Mar 3, 2014 at 7:20 PM, Julian Taylor wrote: > hi, > > as the numpy gsoc topic page is a little short on options I was thinking > about adding two topics for interested students. But as I have no > experience with gsoc or mentoring and the ideas are not very fleshed out > yet I'd like to ask if it might make sense at all: > > 1. configurable algorithm precision [...] > with np.precmode(default="fast"): > np.abs(complex_array) > > or fast everything except sum and hypot > > with np.precmode(default="fast", sum="kahan", hypot="standard"): > np.sum(d) [...] Not a big fan of this one -- it seems like the biggest bulk of the effort would be in figuring out a non-horrible API for exposing these things and getting consensus around it, which is not a good fit to the SoC structure. I'm pretty nervous about the datetime proposal that's currently on the wiki, for similar reasons -- I'm not sure it's actually doable in the SoC context. > 2. vector math library integration This is a great suggestion -- clear scope, clear benefit. Two more ideas: 3. Using Cython in the numpy core The numpy core contains tons of complicated C code implementing elaborate operations like indexing, casting, ufunc dispatch, etc. It would be really nice if we could use Cython to write some of these things. However, there is a practical problem: Cython assumes that each .pyx file generates a single compiled module with its own Cython-defined API. Numpy, however, contains a large number of .c files which are all compiled together into a single module, with its own home-brewed system for defining the public API. And we can't rewrite the whole thing. So for this to be viable, we would need some way to compile a bunch of .c *and .pyx* files together into a single module, and allow the .c and .pyx files to call each other. This might involve changes to Cython, some sort of clever post-processing or glue code to get existing cython-generated source code to play nicely with the rest of numpy, or something else. So this project would have the following goals, depending on how practical this turns out to be: (1) produce a hacky proof-of-concept system for doing the above, (2) turn the hacky proof-of-concept into something actually viable for use in real life (possibly this would require getting changes upstream into Cython, etc.), (3) use this system to actually port some interesting numpy code into cython. 4. Pythonic dtypes The current dtype system is klugey. It basically defines its own class system, in parallel to Python's, and unsurprisingly, this new class system is not as good. In particular, it has limitations around the storage of instance-specific data which rule out a large variety of interesting user-defined dtypes, and causes us to need some truly nasty hacks to support the built-in dtypes we do have. And it makes defining a new dtype much more complicated than defining a new Python class. This project would be to implement a new dtype system for numpy, in which np.dtype becomes a near-empty base class, different dtypes (e.g., float64, float32) are simply different subclasses of np.dtype, and dtype objects are simply instances of these classes. Further enhancements would be to make it possible to define new dtypes in pure Python by subclassing np.dtype and implementing special methods for the various dtype operations, and to make it possible for ufunc loops to see the dtype objects. This project would provide the key enabling piece for a wide variety of interesting new features: missing value support, better handling of strings and categorical data, unit handling, automatic differentiation, and probably a bunch more I'm forgetting right now. If we get someone who's up to handling the dtype thing then I can mentor or co-mentor. What do y'all think? (I don't think I have access to update that wiki page -- or maybe I'm just not clever enough to figure out how -- so it would be helpful if someone who can, could?) -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org From matthew.brett at gmail.com Wed Mar 5 18:29:51 2014 From: matthew.brett at gmail.com (Matthew Brett) Date: Wed, 5 Mar 2014 15:29:51 -0800 Subject: [Numpy-discussion] 1.8.1rc1 on sourceforge. In-Reply-To: <53160D81.3030804@googlemail.com> References: <5315CBD9.5040201@web.de> <53160894.7070207@uci.edu> <53160D81.3030804@googlemail.com> Message-ID: Hi, I built (and tested) some numpy wheels for the rc1: http://nipy.bic.berkeley.edu/numpy-dist/ Cheers, Matthew From matthew.brett at gmail.com Wed Mar 5 21:28:21 2014 From: matthew.brett at gmail.com (Matthew Brett) Date: Wed, 5 Mar 2014 18:28:21 -0800 Subject: [Numpy-discussion] 1.8.1rc1 on sourceforge. In-Reply-To: References: <5315CBD9.5040201@web.de> <53160894.7070207@uci.edu> <53160D81.3030804@googlemail.com> Message-ID: Hi, On Wed, Mar 5, 2014 at 3:29 PM, Matthew Brett wrote: > Hi, > > I built (and tested) some numpy wheels for the rc1: > > http://nipy.bic.berkeley.edu/numpy-dist/ Now building, installing, testing, uploading wheels nightly on OSX 10.9: http://nipy.bic.berkeley.edu/builders/numpy-bdist-whl-osx-2.7 http://nipy.bic.berkeley.edu/builders/numpy-bdist-whl-osx-3.3 and downloading, testing built wheels on OSX 10.6: http://nipy.bic.berkeley.edu/builders/numpy-bdist-whl-osx-2.7-downloaded http://nipy.bic.berkeley.edu/builders/numpy-bdist-whl-osx-3.3-downloaded Chuck - are you release manager for this cycle? Would you mind sending me your public ssh key so I can give you access to the buildbots for custom builds and so on? Cheers, Matthew From sturla.molden at gmail.com Thu Mar 6 00:17:32 2014 From: sturla.molden at gmail.com (Sturla Molden) Date: Thu, 6 Mar 2014 05:17:32 +0000 (UTC) Subject: [Numpy-discussion] numpy gsoc ideas (was: numpy gsoc topic idea: configurable algorithm precision and vector math library integration) References: Message-ID: <747438632415775353.696096sturla.molden-gmail.com@news.gmane.org> Nathaniel Smith wrote: > 3. Using Cython in the numpy core > > The numpy core contains tons of complicated C code implementing > elaborate operations like indexing, casting, ufunc dispatch, etc. It > would be really nice if we could use Cython to write some of these > things. So the idea of having a NumPy as a pure C library in the core is abandoned? > However, there is a practical problem: Cython assumes that > each .pyx file generates a single compiled module with its own > Cython-defined API. Numpy, however, contains a large number of .c > files which are all compiled together into a single module, with its > own home-brewed system for defining the public API. And we can't > rewrite the whole thing. So for this to be viable, we would need some > way to compile a bunch of .c *and .pyx* files together into a single > module, and allow the .c and .pyx files to call each other. Cython takes care of that already. http://docs.cython.org/src/userguide/sharing_declarations.html#cimport http://docs.cython.org/src/userguide/external_C_code.html#using-cython-declarations-from-c Sturla From cournape at gmail.com Thu Mar 6 04:11:41 2014 From: cournape at gmail.com (David Cournapeau) Date: Thu, 6 Mar 2014 09:11:41 +0000 Subject: [Numpy-discussion] numpy gsoc ideas (was: numpy gsoc topic idea: configurable algorithm precision and vector math library integration) In-Reply-To: References: Message-ID: On Wed, Mar 5, 2014 at 9:11 PM, Nathaniel Smith wrote: > On Mon, Mar 3, 2014 at 7:20 PM, Julian Taylor > wrote: > > hi, > > > > as the numpy gsoc topic page is a little short on options I was thinking > > about adding two topics for interested students. But as I have no > > experience with gsoc or mentoring and the ideas are not very fleshed out > > yet I'd like to ask if it might make sense at all: > > > > 1. configurable algorithm precision > [...] > > with np.precmode(default="fast"): > > np.abs(complex_array) > > > > or fast everything except sum and hypot > > > > with np.precmode(default="fast", sum="kahan", hypot="standard"): > > np.sum(d) > [...] > > Not a big fan of this one -- it seems like the biggest bulk of the > effort would be in figuring out a non-horrible API for exposing these > things and getting consensus around it, which is not a good fit to the > SoC structure. > > I'm pretty nervous about the datetime proposal that's currently on the > wiki, for similar reasons -- I'm not sure it's actually doable in the > SoC context. > > > 2. vector math library integration > > This is a great suggestion -- clear scope, clear benefit. > > Two more ideas: > > 3. Using Cython in the numpy core > > The numpy core contains tons of complicated C code implementing > elaborate operations like indexing, casting, ufunc dispatch, etc. It > would be really nice if we could use Cython to write some of these > things. However, there is a practical problem: Cython assumes that > each .pyx file generates a single compiled module with its own > Cython-defined API. Numpy, however, contains a large number of .c > files which are all compiled together into a single module, with its > own home-brewed system for defining the public API. And we can't > rewrite the whole thing. So for this to be viable, we would need some > way to compile a bunch of .c *and .pyx* files together into a single > module, and allow the .c and .pyx files to call each other. This might > involve changes to Cython, some sort of clever post-processing or glue > code to get existing cython-generated source code to play nicely with > the rest of numpy, or something else. > > So this project would have the following goals, depending on how > practical this turns out to be: (1) produce a hacky proof-of-concept > system for doing the above, (2) turn the hacky proof-of-concept into > something actually viable for use in real life (possibly this would > require getting changes upstream into Cython, etc.), (3) use this > system to actually port some interesting numpy code into cython. > Having to synchronise two projects may be hard for a GSoC, no ? Otherwise, I am a bit worried about cython being used on the current C code as is, because core and python C API are so interwined (especially multiarray). Maybe one could use cython on the non-core numpy parts that are still in C ? It is not as sexy of a project, though. > > 4. Pythonic dtypes > > The current dtype system is klugey. It basically defines its own class > system, in parallel to Python's, and unsurprisingly, this new class > system is not as good. In particular, it has limitations around the > storage of instance-specific data which rule out a large variety of > interesting user-defined dtypes, and causes us to need some truly > nasty hacks to support the built-in dtypes we do have. And it makes > defining a new dtype much more complicated than defining a new Python > class. > > This project would be to implement a new dtype system for numpy, in > which np.dtype becomes a near-empty base class, different dtypes > (e.g., float64, float32) are simply different subclasses of np.dtype, > and dtype objects are simply instances of these classes. Further > enhancements would be to make it possible to define new dtypes in pure > Python by subclassing np.dtype and implementing special methods for > the various dtype operations, and to make it possible for ufunc loops > to see the dtype objects. > > This project would provide the key enabling piece for a wide variety > of interesting new features: missing value support, better handling of > strings and categorical data, unit handling, automatic > differentiation, and probably a bunch more I'm forgetting right now. > > If we get someone who's up to handling the dtype thing then I can > mentor or co-mentor. > > What do y'all think? > > (I don't think I have access to update that wiki page -- or maybe I'm > just not clever enough to figure out how -- so it would be helpful if > someone who can, could?) > > -- > Nathaniel J. Smith > Postdoctoral researcher - Informatics - University of Edinburgh > http://vorpus.org > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From gregor.thalhammer at gmail.com Thu Mar 6 05:55:42 2014 From: gregor.thalhammer at gmail.com (Gregor Thalhammer) Date: Thu, 6 Mar 2014 11:55:42 +0100 Subject: [Numpy-discussion] numpy gsoc topic idea: configurable algorithm precision and vector math library integration In-Reply-To: <5314D5FF.2070104@googlemail.com> References: <5314D5FF.2070104@googlemail.com> Message-ID: <5EE72FBD-D063-4517-A986-A19CA8777613@gmail.com> Am 03.03.2014 um 20:20 schrieb Julian Taylor : > hi, > > as the numpy gsoc topic page is a little short on options I was thinking > about adding two topics for interested students. But as I have no > experience with gsoc or mentoring and the ideas are not very fleshed out > yet I'd like to ask if it might make sense at all: > > > 2. vector math library integration > > some operations like powers, sin, cos etc are relatively slow in numpy > depending on the c library used. There are now a few free libraries > available that make use of modern hardware to speed these operations up, > e.g. sleef and yeppp (also mkl but I have no interest in supporting > non-free software) > It might be interesting to investigate if these libraries can be > integrated with numpy. > This also somewhat ties in with the configurable precision mode as the > vector math libraries often have different options depending on > precision and speed requirements. I have been exhuming an old package I once wrote that wraps the vectorized math functions from Intels MKL for use with numpy. I made it available at https://github.com/geggo/uvml . I don't have access to MKL anymore, so no idea whether this package still works with recent numpy. If still useful, adapting to work with other libraries should not be difficult since they all provide a similar API. For serious work other packages like numexpr, numba or theano are much better. Nevertheless some might want to pick up this approach. Gregor From albert.jornet at ic3.cat Thu Mar 6 06:17:34 2014 From: albert.jornet at ic3.cat (Albert Jornet Puig) Date: Thu, 06 Mar 2014 12:17:34 +0100 Subject: [Numpy-discussion] numpy apply_along_axis named arguments Message-ID: <5318594E.1020401@ic3.cat> Hi All, I am working with *apply_along_axis* method and I would like to apply a method that requires to pass named arguments (scipy.stats.mstats.mquantiles with prob[]). But currently, it is not possible with *apply_along_axis*. I wonder if it would make sense to add the possibility to pass named arguments. I am also aware that It could be implement using other ways (a loop for each row). That's why I would like to ask whether it makes sense or not to ask for this. Even though, I managed to modify the code, which is simple. http://pastebin.com/pBn0TbgK -------------- next part -------------- An HTML attachment was scrubbed... URL: From albert.jornet at ic3.cat Thu Mar 6 07:23:17 2014 From: albert.jornet at ic3.cat (Albert Jornet Puig) Date: Thu, 06 Mar 2014 13:23:17 +0100 Subject: [Numpy-discussion] numpy apply_along_axis named arguments In-Reply-To: <5318594E.1020401@ic3.cat> References: <5318594E.1020401@ic3.cat> Message-ID: <531868B5.9070706@ic3.cat> Please, find below the patch file for numpy 1.8.0 http://pastebin.com/D33fFpjH On 06/03/14 12:17, Albert Jornet Puig wrote: > Hi All, > > I am working with *apply_along_axis* method and I would like to apply > a method that requires to pass named arguments > (scipy.stats.mstats.mquantiles with prob[]). But currently, it is not > possible with *apply_along_axis*. > > I wonder if it would make sense to add the possibility to pass named > arguments. I am also aware that It could be implement using other ways > (a loop for each row). That's why I would like to ask whether it makes > sense or not to ask for this. > > Even though, I managed to modify the code, which is simple. > > http://pastebin.com/pBn0TbgK > > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Thu Mar 6 07:40:40 2014 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Thu, 06 Mar 2014 13:40:40 +0100 Subject: [Numpy-discussion] Adding weights to cov and corrcoef (Sebastian Berg) In-Reply-To: References: Message-ID: <1394109640.9122.13.camel@sebastian-t440> On Mi, 2014-03-05 at 10:21 -0800, David Goldsmith wrote: > > > > Date: Wed, 05 Mar 2014 17:45:47 +0100 > From: Sebastian Berg > Subject: [Numpy-discussion] Adding weights to cov and corrcoef > To: numpy-discussion at scipy.org > Message-ID: <1394037947.21356.20.camel at sebastian-t440> > Content-Type: text/plain; charset="UTF-8" > > Hi all, > > in Pull Request https://github.com/numpy/numpy/pull/3864 Neol > Dawe > suggested adding new parameters to our `cov` and `corrcoef` > functions to > implement weights, which already exists for `average` (the PR > still > needs to be adapted). > > > Do you mean adopted? > What I meant was that the suggestion isn't actually implemented in the PR at this time. So you can't pull it in to try things out. > > However, we may have missed something obvious, or maybe it is > already > getting too statistical for NumPy, or the keyword argument > might be > better `uncertainties` and `frequencies`. So comments and > insights are > very welcome :). > > > +1 for it being "too baroque" for NumPy--should go in SciPy (if it > isn't already there): IMHO, NumPy should be kept as "lean and mean" as > possible, embellishments are what SciPy is for. (Again, IMO.) > Well, on the other hand, scipy does not actually have a `std` function of its own, I think. So if it is quite useful I think this may be an option (I don't think I ever used weights with std, so I can't argue strongly for inclusion myself). Unless adding new functions to `scipy.stats` (or just statsmodels) which implement different types of weights is the longer term plan, then things might bite... > DG > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From njs at pobox.com Thu Mar 6 08:45:36 2014 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 6 Mar 2014 13:45:36 +0000 Subject: [Numpy-discussion] numpy gsoc ideas (was: numpy gsoc topic idea: configurable algorithm precision and vector math library integration) In-Reply-To: <747438632415775353.696096sturla.molden-gmail.com@news.gmane.org> References: <747438632415775353.696096sturla.molden-gmail.com@news.gmane.org> Message-ID: On Thu, Mar 6, 2014 at 5:17 AM, Sturla Molden wrote: > Nathaniel Smith wrote: > >> 3. Using Cython in the numpy core >> >> The numpy core contains tons of complicated C code implementing >> elaborate operations like indexing, casting, ufunc dispatch, etc. It >> would be really nice if we could use Cython to write some of these >> things. > > So the idea of having a NumPy as a pure C library in the core is abandoned? This question doesn't make sense to me so I think I must be missing some context. Nothing is abandoned: This is one email by one person on one mailing list suggesting a project to the explore the feasibility of something. And anyway, Cython is just a C code generator, similar in principle to (though vastly more sophisticated than) the ones we already use. It's not like we've ever promised our users we'll keep stable which kind of code generators we use internally. >> However, there is a practical problem: Cython assumes that >> each .pyx file generates a single compiled module with its own >> Cython-defined API. Numpy, however, contains a large number of .c >> files which are all compiled together into a single module, with its >> own home-brewed system for defining the public API. And we can't >> rewrite the whole thing. So for this to be viable, we would need some >> way to compile a bunch of .c *and .pyx* files together into a single >> module, and allow the .c and .pyx files to call each other. > > Cython takes care of that already. > > http://docs.cython.org/src/userguide/sharing_declarations.html#cimport > > http://docs.cython.org/src/userguide/external_C_code.html#using-cython-declarations-from-c Linking multiple .c and .pyx files together into a single .so/.dll is much more complicated than just using 'cimport'. Try it if you don't believe me :-). -n -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org From njs at pobox.com Thu Mar 6 08:59:30 2014 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 6 Mar 2014 13:59:30 +0000 Subject: [Numpy-discussion] numpy gsoc ideas (was: numpy gsoc topic idea: configurable algorithm precision and vector math library integration) In-Reply-To: References: Message-ID: On Thu, Mar 6, 2014 at 9:11 AM, David Cournapeau wrote: > > On Wed, Mar 5, 2014 at 9:11 PM, Nathaniel Smith wrote: >> So this project would have the following goals, depending on how >> practical this turns out to be: (1) produce a hacky proof-of-concept >> system for doing the above, (2) turn the hacky proof-of-concept into >> something actually viable for use in real life (possibly this would >> require getting changes upstream into Cython, etc.), (3) use this >> system to actually port some interesting numpy code into cython. > > > Having to synchronise two projects may be hard for a GSoC, no ? Yeah, if someone is interested in this it would be nice to get someone from Cython involved too. But that's why the primary goal is to produce a proof-of-concept -- even if all that comes out is that we learn that this cannot be done in an acceptable manner, then that's still a succesful (albeit disappointing) result. > Otherwise, I am a bit worried about cython being used on the current C code > as is, because core and python C API are so interwined (especially > multiarray). I don't understand this objection. The whole advantage of Cython is that it makes it much, much easier to write code that involves intertwining complex algorithms and heavy use of the Python C API :-). There's tons of bug-prone spaghetti in numpy for doing boring things like refcounting, exception passing, and argument parsing. -n -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org From charlesr.harris at gmail.com Thu Mar 6 12:35:15 2014 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 6 Mar 2014 10:35:15 -0700 Subject: [Numpy-discussion] 1.8.1rc1 on sourceforge. In-Reply-To: References: <5315CBD9.5040201@web.de> <53160894.7070207@uci.edu> <53160D81.3030804@googlemail.com> Message-ID: On Wed, Mar 5, 2014 at 7:28 PM, Matthew Brett wrote: > Hi, > > On Wed, Mar 5, 2014 at 3:29 PM, Matthew Brett > wrote: > > Hi, > > > > I built (and tested) some numpy wheels for the rc1: > > > > http://nipy.bic.berkeley.edu/numpy-dist/ > > Now building, installing, testing, uploading wheels nightly on OSX 10.9: > > http://nipy.bic.berkeley.edu/builders/numpy-bdist-whl-osx-2.7 > http://nipy.bic.berkeley.edu/builders/numpy-bdist-whl-osx-3.3 > > and downloading, testing built wheels on OSX 10.6: > > http://nipy.bic.berkeley.edu/builders/numpy-bdist-whl-osx-2.7-downloaded > http://nipy.bic.berkeley.edu/builders/numpy-bdist-whl-osx-3.3-downloaded > > Chuck - are you release manager for this cycle? Would you mind > sending me your public ssh key so I can give you access to the > buildbots for custom builds and so on? > > Cheers, > > Julian has done most of the work for 1.8.1. I did the 1.8.0 release because it needed doing, but building releases isn't my strong point and Ralf actually did the builds for that. So I'll happily send you my ssh, but either Ralph or Julian might be a better bet for getting the work done :) Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Thu Mar 6 12:37:27 2014 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 6 Mar 2014 10:37:27 -0700 Subject: [Numpy-discussion] 1.8.1rc1 on sourceforge. In-Reply-To: References: <5315CBD9.5040201@web.de> <53160894.7070207@uci.edu> <53160D81.3030804@googlemail.com> Message-ID: On Thu, Mar 6, 2014 at 10:35 AM, Charles R Harris wrote: > > > > On Wed, Mar 5, 2014 at 7:28 PM, Matthew Brett wrote: > >> Hi, >> >> On Wed, Mar 5, 2014 at 3:29 PM, Matthew Brett >> wrote: >> > Hi, >> > >> > I built (and tested) some numpy wheels for the rc1: >> > >> > http://nipy.bic.berkeley.edu/numpy-dist/ >> >> Now building, installing, testing, uploading wheels nightly on OSX 10.9: >> >> http://nipy.bic.berkeley.edu/builders/numpy-bdist-whl-osx-2.7 >> http://nipy.bic.berkeley.edu/builders/numpy-bdist-whl-osx-3.3 >> >> and downloading, testing built wheels on OSX 10.6: >> >> http://nipy.bic.berkeley.edu/builders/numpy-bdist-whl-osx-2.7-downloaded >> http://nipy.bic.berkeley.edu/builders/numpy-bdist-whl-osx-3.3-downloaded >> >> Chuck - are you release manager for this cycle? Would you mind >> sending me your public ssh key so I can give you access to the >> buildbots for custom builds and so on? >> >> Cheers, >> >> > Julian has done most of the work for 1.8.1. I did the 1.8.0 release > because it needed doing, but building releases isn't my strong point and > Ralf actually did the builds for that. So I'll happily send you my ssh, but > either Ralph or Julian might be a better bet for getting the work done :) > > Or, I might add, yourself, if you are interested in taking over that role. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From jsseabold at gmail.com Thu Mar 6 13:46:03 2014 From: jsseabold at gmail.com (Skipper Seabold) Date: Thu, 6 Mar 2014 13:46:03 -0500 Subject: [Numpy-discussion] 1.8.1 release In-Reply-To: References: <1393432035193-36655.post@n7.nabble.com> Message-ID: Hi, Should [1] be considered a release blocker for 1.8.1? Skipper [1] https://github.com/numpy/numpy/issues/4442 From jtaylor.debian at googlemail.com Thu Mar 6 13:51:57 2014 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Thu, 06 Mar 2014 19:51:57 +0100 Subject: [Numpy-discussion] 1.8.1 release In-Reply-To: References: <1393432035193-36655.post@n7.nabble.com> Message-ID: <5318C3CD.2040102@googlemail.com> On 06.03.2014 19:46, Skipper Seabold wrote: > Hi, > > Should [1] be considered a release blocker for 1.8.1? > > Skipper > > [1] https://github.com/numpy/numpy/issues/4442 as far as I can tell its a regression of the 1.8.0 release but not the 1.8.1 release so I wouldn't consider it a blocker. But its definitely a very nice to have. Unfortunately it is probably also complicated and invasive to fix as it would need either modifications of nditer or gufuncs (or a revert to non gufunc) which are both quite complicated pieces of code. From jtaylor.debian at googlemail.com Thu Mar 6 13:58:56 2014 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Thu, 06 Mar 2014 19:58:56 +0100 Subject: [Numpy-discussion] 1.8.1rc1 on sourceforge. In-Reply-To: <5315CBD9.5040201@web.de> References: <5315CBD9.5040201@web.de> Message-ID: <5318C570.4020905@googlemail.com> thanks for the report, thix should be fixed with https://github.com/numpy/numpy/pull/4455 which will be in the final 1.8.1 On 04.03.2014 13:49, Thomas Unterthiner wrote: > Hi there! > > I just tried setting up a new installation using numpy 1.8.1rc1 (+scipy > 0.13.3 and matplotlib 1.3.1). I ran into problems when installing > matplotlib 1.3.1. The attached logfile shows the full log, but it ends with: > > src/_png.cpp:329:15: error: ?npy_PyFile_Dup? was not declared in this scope > if ((fp = npy_PyFile_Dup(py_file, "rb"))) > ^ > src/_png.cpp:577:13: error: ?npy_PyFile_DupClose? was not declared in > this scope > if (npy_PyFile_DupClose(py_file, fp)) { > ^ > error: command 'x86_64-linux-gnu-gcc' failed with exit status 1 > > > The problem went away (and matplotlib installed cleanly) when I re-did > the whole shebang using numpy 1.8.0, so I suspect this was caused by > something in the rc. > > Cheers > > Thomas > > > > On 2014-03-03 17:23, Charles R Harris wrote: >> Hi All, >> >> Julian Taylor has put windows binaries and sources for the 1.8.1 >> release candidate up on sourceforge >> . If >> things go well, it will taken to a full release in a week or so. >> >> Chuck >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From matthew.brett at gmail.com Thu Mar 6 14:05:50 2014 From: matthew.brett at gmail.com (Matthew Brett) Date: Thu, 6 Mar 2014 11:05:50 -0800 Subject: [Numpy-discussion] 1.8.1rc1 on sourceforge. In-Reply-To: References: <5315CBD9.5040201@web.de> <53160894.7070207@uci.edu> <53160D81.3030804@googlemail.com> Message-ID: Hi, On Thu, Mar 6, 2014 at 9:37 AM, Charles R Harris wrote: > > > > On Thu, Mar 6, 2014 at 10:35 AM, Charles R Harris > wrote: >> >> >> >> >> On Wed, Mar 5, 2014 at 7:28 PM, Matthew Brett >> wrote: >>> >>> Hi, >>> >>> On Wed, Mar 5, 2014 at 3:29 PM, Matthew Brett >>> wrote: >>> > Hi, >>> > >>> > I built (and tested) some numpy wheels for the rc1: >>> > >>> > http://nipy.bic.berkeley.edu/numpy-dist/ >>> >>> Now building, installing, testing, uploading wheels nightly on OSX 10.9: >>> >>> http://nipy.bic.berkeley.edu/builders/numpy-bdist-whl-osx-2.7 >>> http://nipy.bic.berkeley.edu/builders/numpy-bdist-whl-osx-3.3 >>> >>> and downloading, testing built wheels on OSX 10.6: >>> >>> http://nipy.bic.berkeley.edu/builders/numpy-bdist-whl-osx-2.7-downloaded >>> http://nipy.bic.berkeley.edu/builders/numpy-bdist-whl-osx-3.3-downloaded >>> >>> Chuck - are you release manager for this cycle? Would you mind >>> sending me your public ssh key so I can give you access to the >>> buildbots for custom builds and so on? >>> >>> Cheers, >>> >> >> Julian has done most of the work for 1.8.1. I did the 1.8.0 release >> because it needed doing, but building releases isn't my strong point and >> Ralf actually did the builds for that. So I'll happily send you my ssh, but >> either Ralph or Julian might be a better bet for getting the work done :) >> > > Or, I might add, yourself, if you are interested in taking over that role. I don't know the code well enough to be the release manager, but I'm very happy to do the OSX binary builds. So - release manager VP of OSX maybe? Cheers, Matthew From charlesr.harris at gmail.com Thu Mar 6 14:21:13 2014 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 6 Mar 2014 12:21:13 -0700 Subject: [Numpy-discussion] 1.8.1rc1 on sourceforge. In-Reply-To: References: <5315CBD9.5040201@web.de> <53160894.7070207@uci.edu> <53160D81.3030804@googlemail.com> Message-ID: On Thu, Mar 6, 2014 at 12:05 PM, Matthew Brett wrote: > Hi, > > On Thu, Mar 6, 2014 at 9:37 AM, Charles R Harris > wrote: > > > > > > > > On Thu, Mar 6, 2014 at 10:35 AM, Charles R Harris > > wrote: > >> > >> > >> > >> > >> On Wed, Mar 5, 2014 at 7:28 PM, Matthew Brett > >> wrote: > >>> > >>> Hi, > >>> > >>> On Wed, Mar 5, 2014 at 3:29 PM, Matthew Brett > > >>> wrote: > >>> > Hi, > >>> > > >>> > I built (and tested) some numpy wheels for the rc1: > >>> > > >>> > http://nipy.bic.berkeley.edu/numpy-dist/ > >>> > >>> Now building, installing, testing, uploading wheels nightly on OSX > 10.9: > >>> > >>> http://nipy.bic.berkeley.edu/builders/numpy-bdist-whl-osx-2.7 > >>> http://nipy.bic.berkeley.edu/builders/numpy-bdist-whl-osx-3.3 > >>> > >>> and downloading, testing built wheels on OSX 10.6: > >>> > >>> > http://nipy.bic.berkeley.edu/builders/numpy-bdist-whl-osx-2.7-downloaded > >>> > http://nipy.bic.berkeley.edu/builders/numpy-bdist-whl-osx-3.3-downloaded > >>> > >>> Chuck - are you release manager for this cycle? Would you mind > >>> sending me your public ssh key so I can give you access to the > >>> buildbots for custom builds and so on? > >>> > >>> Cheers, > >>> > >> > >> Julian has done most of the work for 1.8.1. I did the 1.8.0 release > >> because it needed doing, but building releases isn't my strong point and > >> Ralf actually did the builds for that. So I'll happily send you my ssh, > but > >> either Ralph or Julian might be a better bet for getting the work done > :) > >> > > > > Or, I might add, yourself, if you are interested in taking over that > role. > > I don't know the code well enough to be the release manager, but I'm > very happy to do the OSX binary builds. So - release manager VP of > OSX maybe? > > That would be helpful. Ralf does those now and I suspect he would welcome the extra hands. The two sites for release builds are Sourceforge and Pypi. I don't know if the wheels builds are good enough/accepted on Pypi, but if you would like permissions on Sourceforge we can extend them to you. We have been trying to do releases for OSX 1.5, which needs a machine running an obsolete OS, but perhaps we should consider dropping that in the future. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From jtaylor.debian at googlemail.com Thu Mar 6 14:24:48 2014 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Thu, 06 Mar 2014 20:24:48 +0100 Subject: [Numpy-discussion] 1.8.1rc1 on sourceforge. In-Reply-To: References: <5315CBD9.5040201@web.de> <53160894.7070207@uci.edu> <53160D81.3030804@googlemail.com> Message-ID: <5318CB80.8040409@googlemail.com> On 06.03.2014 20:05, Matthew Brett wrote: > Hi, > On Thu, Mar 6, 2014 at 9:37 AM, Charles R Harris > wrote: >> On Thu, Mar 6, 2014 at 10:35 AM, Charles R Harris >> wrote: >>> On Wed, Mar 5, 2014 at 7:28 PM, Matthew Brett >>> wrote: >>>> Hi, >>>> >>>> On Wed, Mar 5, 2014 at 3:29 PM, Matthew Brett >>>> wrote: >>>>> Hi, >>>>> >>>>> I built (and tested) some numpy wheels for the rc1: >>>>> >>>>> http://nipy.bic.berkeley.edu/numpy-dist/ >>>> >>>> Now building, installing, testing, uploading wheels nightly on OSX 10.9: >>>> >>>> http://nipy.bic.berkeley.edu/builders/numpy-bdist-whl-osx-2.7 >>>> http://nipy.bic.berkeley.edu/builders/numpy-bdist-whl-osx-3.3 >>>> >>>> and downloading, testing built wheels on OSX 10.6: >>>> >>>> http://nipy.bic.berkeley.edu/builders/numpy-bdist-whl-osx-2.7-downloaded >>>> http://nipy.bic.berkeley.edu/builders/numpy-bdist-whl-osx-3.3-downloaded >>>> >>>> Chuck - are you release manager for this cycle? Would you mind >>>> sending me your public ssh key so I can give you access to the >>>> buildbots for custom builds and so on? >>>> >>>> Cheers, >>>> >>> >>> Julian has done most of the work for 1.8.1. I did the 1.8.0 release >>> because it needed doing, but building releases isn't my strong point and >>> Ralf actually did the builds for that. So I'll happily send you my ssh, but >>> either Ralph or Julian might be a better bet for getting the work done :) >>> >> >> Or, I might add, yourself, if you are interested in taking over that role. > > I don't know the code well enough to be the release manager, but I'm > very happy to do the OSX binary builds. So - release manager VP of > OSX maybe? > > Cheers, > > Matthew I'm using Ond?ej ?ert?k nice vagrant setup to do the releases, maybe you can have a look if the macos stuff still works with it and updated it for your wheels? https://github.com/juliantaylor/numpy-vendor From d.l.goldsmith at gmail.com Thu Mar 6 14:27:01 2014 From: d.l.goldsmith at gmail.com (David Goldsmith) Date: Thu, 6 Mar 2014 11:27:01 -0800 Subject: [Numpy-discussion] Adding weights to cov and corrcoef Message-ID: Date: Thu, 06 Mar 2014 13:40:40 +0100 > From: Sebastian Berg > Subject: Re: [Numpy-discussion] Adding weights to cov and corrcoef > (Sebastian Berg) > To: numpy-discussion at scipy.org > Message-ID: <1394109640.9122.13.camel at sebastian-t440> > Content-Type: text/plain; charset="UTF-8" > > On Mi, 2014-03-05 at 10:21 -0800, David Goldsmith wrote: > > +1 for it being "too baroque" for NumPy--should go in SciPy (if it > > isn't already there): IMHO, NumPy should be kept as "lean and mean" as > > possible, embellishments are what SciPy is for. (Again, IMO.) > > > > Well, on the other hand, scipy does not actually have a `std` function > of its own, I think. Oh, well, in that case forget I said anything. (Though I think it's "interesting" that no one else has chimed in: if you're the only one that needs it (at this time), perhaps it would be best to "roll your own" and then offer to "pass it around." :-)) DG > So if it is quite useful I think this may be an > option (I don't think I ever used weights with std, so I can't argue > strongly for inclusion myself). Unless adding new functions to > `scipy.stats` (or just statsmodels) which implement different types of > weights is the longer term plan, then things might bite... > > > DG > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > ------------------------------ > > Message: 5 > Date: Thu, 6 Mar 2014 13:45:36 +0000 > From: Nathaniel Smith > Subject: Re: [Numpy-discussion] numpy gsoc ideas (was: numpy gsoc > topic idea: configurable algorithm precision and vector math > library > integration) > To: Discussion of Numerical Python > Message-ID: > R3Fw at mail.gmail.com> > Content-Type: text/plain; charset=UTF-8 > > On Thu, Mar 6, 2014 at 5:17 AM, Sturla Molden > wrote: > > Nathaniel Smith wrote: > > > >> 3. Using Cython in the numpy core > >> > >> The numpy core contains tons of complicated C code implementing > >> elaborate operations like indexing, casting, ufunc dispatch, etc. It > >> would be really nice if we could use Cython to write some of these > >> things. > > > > So the idea of having a NumPy as a pure C library in the core is > abandoned? > > This question doesn't make sense to me so I think I must be missing > some context. > > Nothing is abandoned: This is one email by one person on one mailing > list suggesting a project to the explore the feasibility of something. > And anyway, Cython is just a C code generator, similar in principle to > (though vastly more sophisticated than) the ones we already use. It's > not like we've ever promised our users we'll keep stable which kind of > code generators we use internally. > > >> However, there is a practical problem: Cython assumes that > >> each .pyx file generates a single compiled module with its own > >> Cython-defined API. Numpy, however, contains a large number of .c > >> files which are all compiled together into a single module, with its > >> own home-brewed system for defining the public API. And we can't > >> rewrite the whole thing. So for this to be viable, we would need some > >> way to compile a bunch of .c *and .pyx* files together into a single > >> module, and allow the .c and .pyx files to call each other. > > > > Cython takes care of that already. > > > > http://docs.cython.org/src/userguide/sharing_declarations.html#cimport > > > > > http://docs.cython.org/src/userguide/external_C_code.html#using-cython-declarations-from-c > > Linking multiple .c and .pyx files together into a single .so/.dll is > much more complicated than just using 'cimport'. Try it if you don't > believe me :-). > > -n > > -- > Nathaniel J. Smith > Postdoctoral researcher - Informatics - University of Edinburgh > http://vorpus.org > > > ------------------------------ > > Message: 6 > Date: Thu, 6 Mar 2014 13:59:30 +0000 > From: Nathaniel Smith > Subject: Re: [Numpy-discussion] numpy gsoc ideas (was: numpy gsoc > topic idea: configurable algorithm precision and vector math > library > integration) > To: Discussion of Numerical Python > Message-ID: > 26izjbJPjg at mail.gmail.com> > Content-Type: text/plain; charset=UTF-8 > > On Thu, Mar 6, 2014 at 9:11 AM, David Cournapeau > wrote: > > > > On Wed, Mar 5, 2014 at 9:11 PM, Nathaniel Smith wrote: > >> So this project would have the following goals, depending on how > >> practical this turns out to be: (1) produce a hacky proof-of-concept > >> system for doing the above, (2) turn the hacky proof-of-concept into > >> something actually viable for use in real life (possibly this would > >> require getting changes upstream into Cython, etc.), (3) use this > >> system to actually port some interesting numpy code into cython. > > > > > > Having to synchronise two projects may be hard for a GSoC, no ? > > Yeah, if someone is interested in this it would be nice to get someone > from Cython involved too. But that's why the primary goal is to > produce a proof-of-concept -- even if all that comes out is that we > learn that this cannot be done in an acceptable manner, then that's > still a succesful (albeit disappointing) result. > > > Otherwise, I am a bit worried about cython being used on the current C > code > > as is, because core and python C API are so interwined (especially > > multiarray). > > I don't understand this objection. The whole advantage of Cython is > that it makes it much, much easier to write code that involves > intertwining complex algorithms and heavy use of the Python C API :-). > There's tons of bug-prone spaghetti in numpy for doing boring things > like refcounting, exception passing, and argument parsing. > > -n > > -- > Nathaniel J. Smith > Postdoctoral researcher - Informatics - University of Edinburgh > http://vorpus.org > > > ------------------------------ > > Message: 7 > Date: Thu, 6 Mar 2014 10:35:15 -0700 > From: Charles R Harris > Subject: Re: [Numpy-discussion] 1.8.1rc1 on sourceforge. > To: Discussion of Numerical Python > Message-ID: > < > CAB6mnx+btuF3vKxvebfBZKybggp+C6mtnb7zD1ck1bqk9VXV2w at mail.gmail.com> > Content-Type: text/plain; charset="iso-8859-1" > > On Wed, Mar 5, 2014 at 7:28 PM, Matthew Brett >wrote: > > > Hi, > > > > On Wed, Mar 5, 2014 at 3:29 PM, Matthew Brett > > wrote: > > > Hi, > > > > > > I built (and tested) some numpy wheels for the rc1: > > > > > > http://nipy.bic.berkeley.edu/numpy-dist/ > > > > Now building, installing, testing, uploading wheels nightly on OSX 10.9: > > > > http://nipy.bic.berkeley.edu/builders/numpy-bdist-whl-osx-2.7 > > http://nipy.bic.berkeley.edu/builders/numpy-bdist-whl-osx-3.3 > > > > and downloading, testing built wheels on OSX 10.6: > > > > http://nipy.bic.berkeley.edu/builders/numpy-bdist-whl-osx-2.7-downloaded > > http://nipy.bic.berkeley.edu/builders/numpy-bdist-whl-osx-3.3-downloaded > > > > Chuck - are you release manager for this cycle? Would you mind > > sending me your public ssh key so I can give you access to the > > buildbots for custom builds and so on? > > > > Cheers, > > > > > Julian has done most of the work for 1.8.1. I did the 1.8.0 release because > it needed doing, but building releases isn't my strong point and Ralf > actually did the builds for that. So I'll happily send you my ssh, but > either Ralph or Julian might be a better bet for getting the work done :) > > Chuck > -------------- next part -------------- > An HTML attachment was scrubbed... > URL: > http://mail.scipy.org/pipermail/numpy-discussion/attachments/20140306/d6534585/attachment.html > > ------------------------------ > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > End of NumPy-Discussion Digest, Vol 90, Issue 13 > ************************************************ > -- >From "A Letter From The Future" in "Peak Everything" by Richard Heinberg: "By the time I was an older teenager, a certain...attitude was developing among the young people...a feeling of utter contempt for anyone over a certain age--maybe 30 or 40. The adults had consumed so many resources, and now there were none left for their own children...when those adults were younger, they [were] just doing what everybody else was doing...they figured it was normal to cut down ancient forests for...phone books, pump every last gallon of oil to power their SUV's...[but] for...my generation all that was just a dim memory...We [grew up] living in darkness, with shortages of food and water, with riots in the streets, with people begging on street corners...for us, the adults were the enemy." Want to *really* understand what's *really* going on? Read "Peak Everything." -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Thu Mar 6 14:38:32 2014 From: matthew.brett at gmail.com (Matthew Brett) Date: Thu, 6 Mar 2014 11:38:32 -0800 Subject: [Numpy-discussion] 1.8.1rc1 on sourceforge. In-Reply-To: References: <5315CBD9.5040201@web.de> <53160894.7070207@uci.edu> <53160D81.3030804@googlemail.com> Message-ID: Hi, On Thu, Mar 6, 2014 at 11:21 AM, Charles R Harris wrote: > > > > On Thu, Mar 6, 2014 at 12:05 PM, Matthew Brett > wrote: >> >> Hi, >> >> On Thu, Mar 6, 2014 at 9:37 AM, Charles R Harris >> wrote: >> > >> > >> > >> > On Thu, Mar 6, 2014 at 10:35 AM, Charles R Harris >> > wrote: >> >> >> >> >> >> >> >> >> >> On Wed, Mar 5, 2014 at 7:28 PM, Matthew Brett >> >> wrote: >> >>> >> >>> Hi, >> >>> >> >>> On Wed, Mar 5, 2014 at 3:29 PM, Matthew Brett >> >>> >> >>> wrote: >> >>> > Hi, >> >>> > >> >>> > I built (and tested) some numpy wheels for the rc1: >> >>> > >> >>> > http://nipy.bic.berkeley.edu/numpy-dist/ >> >>> >> >>> Now building, installing, testing, uploading wheels nightly on OSX >> >>> 10.9: >> >>> >> >>> http://nipy.bic.berkeley.edu/builders/numpy-bdist-whl-osx-2.7 >> >>> http://nipy.bic.berkeley.edu/builders/numpy-bdist-whl-osx-3.3 >> >>> >> >>> and downloading, testing built wheels on OSX 10.6: >> >>> >> >>> >> >>> http://nipy.bic.berkeley.edu/builders/numpy-bdist-whl-osx-2.7-downloaded >> >>> >> >>> http://nipy.bic.berkeley.edu/builders/numpy-bdist-whl-osx-3.3-downloaded >> >>> >> >>> Chuck - are you release manager for this cycle? Would you mind >> >>> sending me your public ssh key so I can give you access to the >> >>> buildbots for custom builds and so on? >> >>> >> >>> Cheers, >> >>> >> >> >> >> Julian has done most of the work for 1.8.1. I did the 1.8.0 release >> >> because it needed doing, but building releases isn't my strong point >> >> and >> >> Ralf actually did the builds for that. So I'll happily send you my ssh, >> >> but >> >> either Ralph or Julian might be a better bet for getting the work done >> >> :) >> >> >> > >> > Or, I might add, yourself, if you are interested in taking over that >> > role. >> >> I don't know the code well enough to be the release manager, but I'm >> very happy to do the OSX binary builds. So - release manager VP of >> OSX maybe? >> > > That would be helpful. Ralf does those now and I suspect he would welcome > the extra hands. The two sites for release builds are Sourceforge and Pypi. > I don't know if the wheels builds are good enough/accepted on Pypi, but if > you would like permissions on Sourceforge we can extend them to you. We have > been trying to do releases for OSX 1.5, which needs a machine running an > obsolete OS, but perhaps we should consider dropping that in the future. Ralf - any thoughts? pypi is accepting wheels: http://pythonwheels.com/ https://pypi.python.org/pypi/pyzmq/14.0.1 Chris B - any comments here? As for the numpy wheels specifically - I believe the ones I posted are correct - but I would very much like to get feedback. And - yes please for access to the sourceforge site so I can upload the wheels for testing. I'd recommend dropping 10.5 compatibility and going for 10.6. Apple hasn't updated 10.5 since 2009. For example, Firefox dropped support for it in 2012. I do have a couple of machines running 10.5 if you need it though. Cheers, Matthew From matthew.brett at gmail.com Thu Mar 6 14:43:46 2014 From: matthew.brett at gmail.com (Matthew Brett) Date: Thu, 6 Mar 2014 11:43:46 -0800 Subject: [Numpy-discussion] 1.8.1rc1 on sourceforge. In-Reply-To: <5318CB80.8040409@googlemail.com> References: <5315CBD9.5040201@web.de> <53160894.7070207@uci.edu> <53160D81.3030804@googlemail.com> <5318CB80.8040409@googlemail.com> Message-ID: Hi, On Thu, Mar 6, 2014 at 11:24 AM, Julian Taylor wrote: > On 06.03.2014 20:05, Matthew Brett wrote: >> Hi, >> On Thu, Mar 6, 2014 at 9:37 AM, Charles R Harris >> wrote: >>> On Thu, Mar 6, 2014 at 10:35 AM, Charles R Harris >>> wrote: >>>> On Wed, Mar 5, 2014 at 7:28 PM, Matthew Brett >>>> wrote: >>>>> Hi, >>>>> >>>>> On Wed, Mar 5, 2014 at 3:29 PM, Matthew Brett >>>>> wrote: >>>>>> Hi, >>>>>> >>>>>> I built (and tested) some numpy wheels for the rc1: >>>>>> >>>>>> http://nipy.bic.berkeley.edu/numpy-dist/ >>>>> >>>>> Now building, installing, testing, uploading wheels nightly on OSX 10.9: >>>>> >>>>> http://nipy.bic.berkeley.edu/builders/numpy-bdist-whl-osx-2.7 >>>>> http://nipy.bic.berkeley.edu/builders/numpy-bdist-whl-osx-3.3 >>>>> >>>>> and downloading, testing built wheels on OSX 10.6: >>>>> >>>>> http://nipy.bic.berkeley.edu/builders/numpy-bdist-whl-osx-2.7-downloaded >>>>> http://nipy.bic.berkeley.edu/builders/numpy-bdist-whl-osx-3.3-downloaded >>>>> >>>>> Chuck - are you release manager for this cycle? Would you mind >>>>> sending me your public ssh key so I can give you access to the >>>>> buildbots for custom builds and so on? >>>>> >>>>> Cheers, >>>>> >>>> >>>> Julian has done most of the work for 1.8.1. I did the 1.8.0 release >>>> because it needed doing, but building releases isn't my strong point and >>>> Ralf actually did the builds for that. So I'll happily send you my ssh, but >>>> either Ralph or Julian might be a better bet for getting the work done :) >>>> >>> >>> Or, I might add, yourself, if you are interested in taking over that role. >> >> I don't know the code well enough to be the release manager, but I'm >> very happy to do the OSX binary builds. So - release manager VP of >> OSX maybe? >> >> Cheers, >> >> Matthew > > I'm using Ond?ej ?ert?k nice vagrant setup to do the releases, maybe you > can have a look if the macos stuff still works with it and updated it > for your wheels? > https://github.com/juliantaylor/numpy-vendor Thanks - looking at it now. Is there already a dedicated "Mac build box" somewhere ? If not, I can set one up. Cheers, Matthew From charlesr.harris at gmail.com Thu Mar 6 14:47:39 2014 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 6 Mar 2014 12:47:39 -0700 Subject: [Numpy-discussion] 1.8.1rc1 on sourceforge. In-Reply-To: References: <5315CBD9.5040201@web.de> <53160894.7070207@uci.edu> <53160D81.3030804@googlemail.com> <5318CB80.8040409@googlemail.com> Message-ID: On Thu, Mar 6, 2014 at 12:43 PM, Matthew Brett wrote: > Hi, > > On Thu, Mar 6, 2014 at 11:24 AM, Julian Taylor > wrote: > > On 06.03.2014 20:05, Matthew Brett wrote: > >> Hi, > >> On Thu, Mar 6, 2014 at 9:37 AM, Charles R Harris > >> wrote: > >>> On Thu, Mar 6, 2014 at 10:35 AM, Charles R Harris > >>> wrote: > >>>> On Wed, Mar 5, 2014 at 7:28 PM, Matthew Brett < > matthew.brett at gmail.com> > >>>> wrote: > >>>>> Hi, > >>>>> > >>>>> On Wed, Mar 5, 2014 at 3:29 PM, Matthew Brett < > matthew.brett at gmail.com> > >>>>> wrote: > >>>>>> Hi, > >>>>>> > >>>>>> I built (and tested) some numpy wheels for the rc1: > >>>>>> > >>>>>> http://nipy.bic.berkeley.edu/numpy-dist/ > >>>>> > >>>>> Now building, installing, testing, uploading wheels nightly on OSX > 10.9: > >>>>> > >>>>> http://nipy.bic.berkeley.edu/builders/numpy-bdist-whl-osx-2.7 > >>>>> http://nipy.bic.berkeley.edu/builders/numpy-bdist-whl-osx-3.3 > >>>>> > >>>>> and downloading, testing built wheels on OSX 10.6: > >>>>> > >>>>> > http://nipy.bic.berkeley.edu/builders/numpy-bdist-whl-osx-2.7-downloaded > >>>>> > http://nipy.bic.berkeley.edu/builders/numpy-bdist-whl-osx-3.3-downloaded > >>>>> > >>>>> Chuck - are you release manager for this cycle? Would you mind > >>>>> sending me your public ssh key so I can give you access to the > >>>>> buildbots for custom builds and so on? > >>>>> > >>>>> Cheers, > >>>>> > >>>> > >>>> Julian has done most of the work for 1.8.1. I did the 1.8.0 release > >>>> because it needed doing, but building releases isn't my strong point > and > >>>> Ralf actually did the builds for that. So I'll happily send you my > ssh, but > >>>> either Ralph or Julian might be a better bet for getting the work > done :) > >>>> > >>> > >>> Or, I might add, yourself, if you are interested in taking over that > role. > >> > >> I don't know the code well enough to be the release manager, but I'm > >> very happy to do the OSX binary builds. So - release manager VP of > >> OSX maybe? > >> > >> Cheers, > >> > >> Matthew > > > > I'm using Ond?ej ?ert?k nice vagrant setup to do the releases, maybe you > > can have a look if the macos stuff still works with it and updated it > > for your wheels? > > https://github.com/juliantaylor/numpy-vendor > > Thanks - looking at it now. Is there already a dedicated "Mac build > box" somewhere ? If not, I can set one up. > > What is your sourceforge identity? If you don't have one, please register. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Thu Mar 6 14:51:29 2014 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 6 Mar 2014 19:51:29 +0000 Subject: [Numpy-discussion] Adding weights to cov and corrcoef In-Reply-To: <1394037947.21356.20.camel@sebastian-t440> References: <1394037947.21356.20.camel@sebastian-t440> Message-ID: On Wed, Mar 5, 2014 at 4:45 PM, Sebastian Berg wrote: > > Hi all, > > in Pull Request https://github.com/numpy/numpy/pull/3864 Neol Dawe > suggested adding new parameters to our `cov` and `corrcoef` functions to > implement weights, which already exists for `average` (the PR still > needs to be adapted). > > The idea right now would be to add a `weights` and a `frequencies` > keyword arguments to these functions. > > In more detail: The situation is a bit more complex for `cov` and > `corrcoef` than `average`, because there are different types of weights. > The current plan would be to add two new keyword arguments: > * weights: Uncertainty weights which causes `N` to be recalculated > accordingly (This is R's `cov.wt` default I believe). > * frequencies: When given, `N = sum(frequencies)` and the values > are weighted by their frequency. I don't understand this description at all. One them recalculates N, and the other sets N according to some calculation? Is there a standard reference on how these are supposed to be interpreted? When you talk about per-value uncertainties, I start imagining that we're trying to estimate a population covariance given a set of samples each corrupted by independent measurement noise, and then there's some natural hierarchical Bayesian model one could write down and get an ML estimate of the latent covariance via empirical Bayes or something. But this requires a bunch of assumptions and is that really what we want to do? (Or maybe it collapses down into something simpler if the measurement noise is gaussian or something?) -n -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org From argriffi at ncsu.edu Thu Mar 6 15:02:53 2014 From: argriffi at ncsu.edu (alex) Date: Thu, 6 Mar 2014 15:02:53 -0500 Subject: [Numpy-discussion] Adding weights to cov and corrcoef In-Reply-To: References: <1394037947.21356.20.camel@sebastian-t440> Message-ID: On Thu, Mar 6, 2014 at 2:51 PM, Nathaniel Smith wrote: > On Wed, Mar 5, 2014 at 4:45 PM, Sebastian Berg > wrote: >> >> Hi all, >> >> in Pull Request https://github.com/numpy/numpy/pull/3864 Neol Dawe >> suggested adding new parameters to our `cov` and `corrcoef` functions to >> implement weights, which already exists for `average` (the PR still >> needs to be adapted). >> >> The idea right now would be to add a `weights` and a `frequencies` >> keyword arguments to these functions. >> >> In more detail: The situation is a bit more complex for `cov` and >> `corrcoef` than `average`, because there are different types of weights. >> The current plan would be to add two new keyword arguments: >> * weights: Uncertainty weights which causes `N` to be recalculated >> accordingly (This is R's `cov.wt` default I believe). >> * frequencies: When given, `N = sum(frequencies)` and the values >> are weighted by their frequency. > > I don't understand this description at all. One them recalculates N, > and the other sets N according to some calculation? > > Is there a standard reference on how these are supposed to be > interpreted? When you talk about per-value uncertainties, I start > imagining that we're trying to estimate a population covariance given > a set of samples each corrupted by independent measurement noise, and > then there's some natural hierarchical Bayesian model one could write > down and get an ML estimate of the latent covariance via empirical > Bayes or something. But this requires a bunch of assumptions and is > that really what we want to do? (Or maybe it collapses down into > something simpler if the measurement noise is gaussian or something?) I think the idea is that if you write formulas involving correlation or covariance using matrix notation, then these formulas can be generalized in several different ways by inserting some non-negative or positive diagonal matrices into the formulas in various places. The diagonal entries could be called 'weights'. If they are further restricted to sum to 1 then they could be called 'frequencies'. Or maybe this is too cynical and the jargon has a more standard meaning in this context. From chris.barker at noaa.gov Thu Mar 6 15:32:05 2014 From: chris.barker at noaa.gov (Chris Barker) Date: Thu, 6 Mar 2014 12:32:05 -0800 Subject: [Numpy-discussion] 1.8.1rc1 on sourceforge. In-Reply-To: References: <5315CBD9.5040201@web.de> <53160894.7070207@uci.edu> <53160D81.3030804@googlemail.com> Message-ID: On Thu, Mar 6, 2014 at 11:21 AM, Charles R Harris wrote: > That would be helpful. Ralf does those now and I suspect he would welcome > the extra hands. The two sites for release builds are Sourceforge and Pypi. > I don't know if the wheels builds are good enough/accepted on Pypi, > Would anyone decide that other than this group? > but if you would like permissions on Sourceforge we can extend them to > you. We have been trying to do releases for OSX 1.5, which needs a machine > running an obsolete OS, but perhaps we should consider dropping that in the > future. > Drop that baby! First, it's bit odd -- as I undertand it, the python.org builds support either 10.3.9 + or 10.6+. As 10.5 has not been supported for Apple for a couple years, and 10.6 is getting pretty darn long in the tooth, the only reason to support that older build is for PPC support - I wonder how many folks are still running PPCs? I thought I was one of the hold outs, and I dropped it over a year ago. I'd love to know if it is something that the community still needs to support. And thanks for doing this Matthew! -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Thu Mar 6 15:36:46 2014 From: chris.barker at noaa.gov (Chris Barker) Date: Thu, 6 Mar 2014 12:36:46 -0800 Subject: [Numpy-discussion] 1.8.1rc1 on sourceforge. In-Reply-To: References: <5315CBD9.5040201@web.de> <53160894.7070207@uci.edu> <53160D81.3030804@googlemail.com> Message-ID: On Thu, Mar 6, 2014 at 11:38 AM, Matthew Brett wrote: > pypi is accepting wheels: > > http://pythonwheels.com/ > https://pypi.python.org/pypi/pyzmq/14.0.1 > > Chris B - any comments here? > It's my understanding that pypi accepts wheels built for the python.orgreleases -- and pip should be able to get the right ones in that case. As far as I know, it's up to the project managers to decide what to put up there. Also, I _think_ that macports, homebrew, and hopefully the Apple builds, won't match to the python.org names, so people won't accidentally get a mis-matched binary wheel. > As for the numpy wheels specifically - I believe the ones I posted are > correct - but I would very much like to get feedback. I, for one will try to test on a couple machines. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Thu Mar 6 15:42:50 2014 From: chris.barker at noaa.gov (Chris Barker) Date: Thu, 6 Mar 2014 12:42:50 -0800 Subject: [Numpy-discussion] numpy gsoc ideas (was: numpy gsoc topic idea: configurable algorithm precision and vector math library integration) In-Reply-To: <747438632415775353.696096sturla.molden-gmail.com@news.gmane.org> References: <747438632415775353.696096sturla.molden-gmail.com@news.gmane.org> Message-ID: On Wed, Mar 5, 2014 at 9:17 PM, Sturla Molden wrote: > we could use Cython to write some of these things. > > So the idea of having a NumPy as a pure C library in the core is abandoned? And at some point, there was the idea of a numpy_core library that could be used entirely independently of cPython. I think Enthought did some work on this for MS, to create a .net numpy, maybe? I do still like that idea.... But there could be a "core" numpy and a "other stuff that is cPython specific" layer than Cython would be great for. -Chris > > -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Thu Mar 6 15:49:11 2014 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Thu, 6 Mar 2014 21:49:11 +0100 Subject: [Numpy-discussion] Adding weights to cov and corrcoef (Sebastian Berg) In-Reply-To: <1394109640.9122.13.camel@sebastian-t440> References: <1394109640.9122.13.camel@sebastian-t440> Message-ID: On Thu, Mar 6, 2014 at 1:40 PM, Sebastian Berg wrote: > On Mi, 2014-03-05 at 10:21 -0800, David Goldsmith wrote: > > > > > > > > Date: Wed, 05 Mar 2014 17:45:47 +0100 > > From: Sebastian Berg > > Subject: [Numpy-discussion] Adding weights to cov and corrcoef > > To: numpy-discussion at scipy.org > > Message-ID: <1394037947.21356.20.camel at sebastian-t440> > > Content-Type: text/plain; charset="UTF-8" > > > > Hi all, > > > > in Pull Request https://github.com/numpy/numpy/pull/3864 Neol > > Dawe > > suggested adding new parameters to our `cov` and `corrcoef` > > functions to > > implement weights, which already exists for `average` (the PR > > still > > needs to be adapted). > > > > > > Do you mean adopted? > > > > What I meant was that the suggestion isn't actually implemented in the > PR at this time. So you can't pull it in to try things out. > > > > > However, we may have missed something obvious, or maybe it is > > already > > getting too statistical for NumPy, or the keyword argument > > might be > > better `uncertainties` and `frequencies`. So comments and > > insights are > > very welcome :). > > > > > > +1 for it being "too baroque" for NumPy--should go in SciPy (if it > > isn't already there): IMHO, NumPy should be kept as "lean and mean" as > > possible, embellishments are what SciPy is for. (Again, IMO.) > > > > Well, on the other hand, scipy does not actually have a `std` function > of its own, I think. So if it is quite useful I think this may be an > option (I don't think I ever used weights with std, so I can't argue > strongly for inclusion myself). Unless adding new functions to > `scipy.stats` (or just statsmodels) which implement different types of > weights is the longer term plan, then things might bite... > AFAIK there's currently no such plan. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Thu Mar 6 16:10:23 2014 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 6 Mar 2014 14:10:23 -0700 Subject: [Numpy-discussion] 1.8.1rc1 on sourceforge. In-Reply-To: References: <5315CBD9.5040201@web.de> <53160894.7070207@uci.edu> <53160D81.3030804@googlemail.com> Message-ID: On Thu, Mar 6, 2014 at 1:32 PM, Chris Barker wrote: > On Thu, Mar 6, 2014 at 11:21 AM, Charles R Harris < > charlesr.harris at gmail.com> wrote: > >> That would be helpful. Ralf does those now and I suspect he would welcome >> the extra hands. The two sites for release builds are Sourceforge and Pypi. >> I don't know if the wheels builds are good enough/accepted on Pypi, >> > > Would anyone decide that other than this group? > > >> but if you would like permissions on Sourceforge we can extend them to >> you. We have been trying to do releases for OSX 1.5, which needs a machine >> running an obsolete OS, but perhaps we should consider dropping that in the >> future. >> > > Drop that baby! > > First, it's bit odd -- as I undertand it, the python.org builds support > either 10.3.9 + or 10.6+. As 10.5 has not been supported for Apple for a > couple years, and 10.6 is getting pretty darn long in the tooth, the only > reason to support that older build is for PPC support - I wonder how many > folks are still running PPCs? I thought I was one of the hold outs, and I > dropped it over a year ago. I'd love to know if it is something that the > community still needs to support. > > Now that I look on sourceforge, I don't see any OS X 10.5 builds, they are all 10.6+. So that bit of support seems to have dropped in reality, if not officially. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Thu Mar 6 16:14:25 2014 From: matthew.brett at gmail.com (Matthew Brett) Date: Thu, 6 Mar 2014 13:14:25 -0800 Subject: [Numpy-discussion] 1.8.1rc1 on sourceforge. In-Reply-To: References: <5315CBD9.5040201@web.de> <53160894.7070207@uci.edu> <53160D81.3030804@googlemail.com> Message-ID: Hi, On Thu, Mar 6, 2014 at 12:36 PM, Chris Barker wrote: > On Thu, Mar 6, 2014 at 11:38 AM, Matthew Brett > wrote: >> >> pypi is accepting wheels: >> >> http://pythonwheels.com/ >> https://pypi.python.org/pypi/pyzmq/14.0.1 >> >> Chris B - any comments here? > > > It's my understanding that pypi accepts wheels built for the python.org > releases -- and pip should be able to get the right ones in that case. > > As far as I know, it's up to the project managers to decide what to put up > there. > > Also, I _think_ that macports, homebrew, and hopefully the Apple builds, > won't match to the python.org names, so people won't accidentally get a > mis-matched binary wheel. I believe that the wheels built against python.org python will in any case work with system python. I've just tested the wheel I built [1] on a 10.7 machine in a system python virtualenv - all tests pass. In any case, unless we do something extra, the built wheel won't install into system python by default, because the wheel name can't match the name system python expects. Here I tested the situation I'd expect when the wheel is on pypi, by downloading the wheel to the current directory of a 10.7 machine and: pip install --pre --find-links . numpy pip doesn't accept the wheel and starts a source install. This is because of the platform tag [2]. System python expects a platform tag that matches the result of `distutils.util.get_platform()`. The python.org builds always have `10_6_intel` for this. On a 10.7 machine: $ /Library/Frameworks/Python.framework/Versions/2.7/bin/python -c "import distutils.util; print(distutils.util.get_platform())" macosx-10.6-intel $ /Library/Frameworks/Python.framework/Versions/3.3/bin/python3 -c "import distutils.util; print(distutils.util.get_platform())" macosx-10.6-intel System python has the actual OSX version. On 10.7 again: $ /usr/bin/python -c "import distutils.util; print(distutils.util.get_platform())" macosx-10.7-intel On 10.9: $ /usr/bin/python -c "import distutils.util; print(distutils.util.get_platform())" macosx-10.9-intel In fact, if I rename my wheel from `numpy-1.8.1rc1-cp27-none-macosx_10_6_intel.whl` to `numpy-1.8.1rc1-cp27-none-macosx_10_6_intel.macosx_10_7_intel.whl`, system python will pick up this wheel, but obviously this could get boring for lots of OSX versions, and in any case, it's not really our target market for the wheels. [4] Min RK actually has a pull request in to relax this OSX version specificity [3] because the wheels should be (and seem to be) interoperable, but the take-home is that we're not likely to run into trouble with system python. Cheers, Matthew [1] http://nipy.bic.berkeley.edu/numpy-dist/numpy-1.8.1rc1-cp27-none-macosx_10_6_intel.whl [2] http://legacy.python.org/dev/peps/pep-0425/ [3] https://github.com/pypa/pip/pull/1465 [4] http://legacy.python.org/dev/peps/pep-0425/#compressed-tag-sets From charlesr.harris at gmail.com Thu Mar 6 16:17:18 2014 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 6 Mar 2014 14:17:18 -0700 Subject: [Numpy-discussion] 1.8.1rc1 on sourceforge. In-Reply-To: References: <5315CBD9.5040201@web.de> <53160894.7070207@uci.edu> <53160D81.3030804@googlemail.com> Message-ID: On Thu, Mar 6, 2014 at 2:10 PM, Charles R Harris wrote: > > > > On Thu, Mar 6, 2014 at 1:32 PM, Chris Barker wrote: > >> On Thu, Mar 6, 2014 at 11:21 AM, Charles R Harris < >> charlesr.harris at gmail.com> wrote: >> >>> That would be helpful. Ralf does those now and I suspect he would >>> welcome the extra hands. The two sites for release builds are Sourceforge >>> and Pypi. I don't know if the wheels builds are good enough/accepted on >>> Pypi, >>> >> >> Would anyone decide that other than this group? >> >> >>> but if you would like permissions on Sourceforge we can extend them to >>> you. We have been trying to do releases for OSX 1.5, which needs a machine >>> running an obsolete OS, but perhaps we should consider dropping that in the >>> future. >>> >> >> Drop that baby! >> >> First, it's bit odd -- as I undertand it, the python.org builds support >> either 10.3.9 + or 10.6+. As 10.5 has not been supported for Apple for a >> couple years, and 10.6 is getting pretty darn long in the tooth, the only >> reason to support that older build is for PPC support - I wonder how many >> folks are still running PPCs? I thought I was one of the hold outs, and I >> dropped it over a year ago. I'd love to know if it is something that the >> community still needs to support. >> >> > Now that I look on sourceforge, I don't see any OS X 10.5 builds, they are > all 10.6+. So that bit of support seems to have dropped in reality, if not > officially. > > The last release to support earlier than that was 1.7.1, which supported 10.3 and that has 643 downloads total. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Thu Mar 6 16:30:56 2014 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 6 Mar 2014 16:30:56 -0500 Subject: [Numpy-discussion] Adding weights to cov and corrcoef (Sebastian Berg) In-Reply-To: References: <1394109640.9122.13.camel@sebastian-t440> Message-ID: On Thu, Mar 6, 2014 at 3:49 PM, Ralf Gommers wrote: > > > > On Thu, Mar 6, 2014 at 1:40 PM, Sebastian Berg > wrote: >> >> On Mi, 2014-03-05 at 10:21 -0800, David Goldsmith wrote: >> > >> > >> > >> > Date: Wed, 05 Mar 2014 17:45:47 +0100 >> > From: Sebastian Berg >> > Subject: [Numpy-discussion] Adding weights to cov and corrcoef >> > To: numpy-discussion at scipy.org >> > Message-ID: <1394037947.21356.20.camel at sebastian-t440> >> > Content-Type: text/plain; charset="UTF-8" >> > >> > Hi all, >> > >> > in Pull Request https://github.com/numpy/numpy/pull/3864 Neol >> > Dawe >> > suggested adding new parameters to our `cov` and `corrcoef` >> > functions to >> > implement weights, which already exists for `average` (the PR >> > still >> > needs to be adapted). >> > >> > >> > Do you mean adopted? >> > >> >> What I meant was that the suggestion isn't actually implemented in the >> PR at this time. So you can't pull it in to try things out. >> >> > >> > However, we may have missed something obvious, or maybe it is >> > already >> > getting too statistical for NumPy, or the keyword argument >> > might be >> > better `uncertainties` and `frequencies`. So comments and >> > insights are >> > very welcome :). >> > >> > >> > +1 for it being "too baroque" for NumPy--should go in SciPy (if it >> > isn't already there): IMHO, NumPy should be kept as "lean and mean" as >> > possible, embellishments are what SciPy is for. (Again, IMO.) >> > >> >> Well, on the other hand, scipy does not actually have a `std` function >> of its own, I think. So if it is quite useful I think this may be an >> option (I don't think I ever used weights with std, so I can't argue >> strongly for inclusion myself). Unless adding new functions to >> `scipy.stats` (or just statsmodels) which implement different types of >> weights is the longer term plan, then things might bite... > > > AFAIK there's currently no such plan. since numpy has taken over all the basic statistics, var, std, cov, corrcoef, and scipy.stats dropped those, I don't see any reason to resurrect them. The only question IMO is which ddof for weighted std, ... statsmodels has the basic statistics with frequency weights, but they are largely in support of t-test and similar hypothesis tests. Josef > > Ralf > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From chris.barker at noaa.gov Thu Mar 6 17:10:52 2014 From: chris.barker at noaa.gov (Chris Barker) Date: Thu, 6 Mar 2014 14:10:52 -0800 Subject: [Numpy-discussion] 1.8.1rc1 on sourceforge. In-Reply-To: References: <5315CBD9.5040201@web.de> <53160894.7070207@uci.edu> <53160D81.3030804@googlemail.com> Message-ID: On Thu, Mar 6, 2014 at 1:14 PM, Matthew Brett wrote: > I believe that the wheels built against python.org python will in any > case work with system python. > IIUC, the system python is built agains an up-to-date SDK. so it wouldn't run on an older OS version -- and why would anyone want it to -- it comes with the system. Our wheels are built with the 10.6 SDK -- OS-X is supposed to be backward compatible, so 10.6 SDK code will run on newer OS versions -- but could there be any clashes if a shared lib is linked in that uses a different SDK than the host application? I have no idea if this is expected to be robust -- though the linker doesn't given an error -- so maybe. I've just tested the wheel I built [1] on a 10.7 machine in a system > python virtualenv - all tests pass. > Good start. > In any case, unless we do something extra, the built wheel won't > install into system python by default, because the wheel name can't > match the name system python expects. Here I tested the situation I'd > expect when the wheel is on pypi, by downloading the wheel to the > current directory of a 10.7 machine and: > > pip install --pre --find-links . numpy > > pip doesn't accept the wheel and starts a source install. This is > because of the platform tag [2]. System python expects a platform tag > that matches the result of `distutils.util.get_platform()`. The > python.org builds always have `10_6_intel` for this. On a 10.7 > machine: > > System python has the actual OSX version. On 10.7 again: > > $ /usr/bin/python -c "import distutils.util; > print(distutils.util.get_platform())" > macosx-10.7-intel > interesting -- the "intel" part of that means is SHOULD be a universal binary with 32 and 64 bit in there. And on my 10.7 system, it is. So maybe this should work. In fact, if I rename my wheel from > `numpy-1.8.1rc1-cp27-none-macosx_10_6_intel.whl` to > `numpy-1.8.1rc1-cp27-none-macosx_10_6_intel.macosx_10_7_intel.whl`, > system python will pick up this wheel, but obviously this could get > boring for lots of OSX versions, and in any case, it's not really our > target market for the wheels. [4] > Exactly, though if we can support it easily, maybe good to do. Min RK actually has a pull request in to relax this OSX version > specificity [3] because the wheels should be (and seem to be) > interoperable, but the take-home is that we're not likely to run into > trouble with system python. > If we relax things too much, will we also get homebrew and macports and built-it-myself pythons, and will they work? -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From sturla.molden at gmail.com Thu Mar 6 19:20:05 2014 From: sturla.molden at gmail.com (Sturla Molden) Date: Fri, 7 Mar 2014 00:20:05 +0000 (UTC) Subject: [Numpy-discussion] Adding weights to cov and corrcoef (Sebastian Berg) References: <1394109640.9122.13.camel@sebastian-t440> Message-ID: <990860791415844251.439708sturla.molden-gmail.com@news.gmane.org> wrote: > The only question IMO is which ddof for weighted std, ... Something like this? sum_weights - (ddof/float(n))*sum_weights Sturla From sebastian at sipsolutions.net Thu Mar 6 19:32:12 2014 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Fri, 07 Mar 2014 01:32:12 +0100 Subject: [Numpy-discussion] Adding weights to cov and corrcoef In-Reply-To: References: <1394037947.21356.20.camel@sebastian-t440> Message-ID: <1394152332.2916.35.camel@sebastian-t440> On Do, 2014-03-06 at 19:51 +0000, Nathaniel Smith wrote: > On Wed, Mar 5, 2014 at 4:45 PM, Sebastian Berg > wrote: > > > > Hi all, > > > > in Pull Request https://github.com/numpy/numpy/pull/3864 Neol Dawe > > suggested adding new parameters to our `cov` and `corrcoef` functions to > > implement weights, which already exists for `average` (the PR still > > needs to be adapted). > > > > The idea right now would be to add a `weights` and a `frequencies` > > keyword arguments to these functions. > > > > In more detail: The situation is a bit more complex for `cov` and > > `corrcoef` than `average`, because there are different types of weights. > > The current plan would be to add two new keyword arguments: > > * weights: Uncertainty weights which causes `N` to be recalculated > > accordingly (This is R's `cov.wt` default I believe). > > * frequencies: When given, `N = sum(frequencies)` and the values > > are weighted by their frequency. > > I don't understand this description at all. One them recalculates N, > and the other sets N according to some calculation? > > Is there a standard reference on how these are supposed to be > interpreted? When you talk about per-value uncertainties, I start > imagining that we're trying to estimate a population covariance given > a set of samples each corrupted by independent measurement noise, and > then there's some natural hierarchical Bayesian model one could write > down and get an ML estimate of the latent covariance via empirical > Bayes or something. But this requires a bunch of assumptions and is > that really what we want to do? (Or maybe it collapses down into > something simpler if the measurement noise is gaussian or something?) > I had really hoped someone who knows this stuff very well would show up ;). I think these weights were uncertainties under gaussian assumption and the other types of weights different, see `aweights` here: http://www.stata.com/support/faqs/statistics/weights-and-summary-statistics/, but I did not check a statistics book or have one here right now (e.g. wikipedia is less than helpful). Frankly unless there is some "obviously right" thing (for a statistician), I would be careful add such new features. And while I thought before that this might be the case, it isn't clear to me. - Sebastian > -n > From sebastian at sipsolutions.net Thu Mar 6 19:43:15 2014 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Fri, 07 Mar 2014 01:43:15 +0100 Subject: [Numpy-discussion] Adding weights to cov and corrcoef (Sebastian Berg) In-Reply-To: References: <1394109640.9122.13.camel@sebastian-t440> Message-ID: <1394152995.2916.42.camel@sebastian-t440> On Do, 2014-03-06 at 16:30 -0500, josef.pktd at gmail.com wrote: > On Thu, Mar 6, 2014 at 3:49 PM, Ralf Gommers wrote: > > > > > > > > On Thu, Mar 6, 2014 at 1:40 PM, Sebastian Berg > > wrote: > >> > >> On Mi, 2014-03-05 at 10:21 -0800, David Goldsmith wrote: > >> > > >> > > >> > > >> > Date: Wed, 05 Mar 2014 17:45:47 +0100 > >> > From: Sebastian Berg > >> > Subject: [Numpy-discussion] Adding weights to cov and corrcoef > >> > To: numpy-discussion at scipy.org > >> > Message-ID: <1394037947.21356.20.camel at sebastian-t440> > >> > Content-Type: text/plain; charset="UTF-8" > >> > > >> > Hi all, > >> > > >> > in Pull Request https://github.com/numpy/numpy/pull/3864 Neol > >> > Dawe > >> > suggested adding new parameters to our `cov` and `corrcoef` > >> > functions to > >> > implement weights, which already exists for `average` (the PR > >> > still > >> > needs to be adapted). > >> > > >> > > >> > Do you mean adopted? > >> > > >> > >> What I meant was that the suggestion isn't actually implemented in the > >> PR at this time. So you can't pull it in to try things out. > >> > >> > > >> > However, we may have missed something obvious, or maybe it is > >> > already > >> > getting too statistical for NumPy, or the keyword argument > >> > might be > >> > better `uncertainties` and `frequencies`. So comments and > >> > insights are > >> > very welcome :). > >> > > >> > > >> > +1 for it being "too baroque" for NumPy--should go in SciPy (if it > >> > isn't already there): IMHO, NumPy should be kept as "lean and mean" as > >> > possible, embellishments are what SciPy is for. (Again, IMO.) > >> > > >> > >> Well, on the other hand, scipy does not actually have a `std` function > >> of its own, I think. So if it is quite useful I think this may be an > >> option (I don't think I ever used weights with std, so I can't argue > >> strongly for inclusion myself). Unless adding new functions to > >> `scipy.stats` (or just statsmodels) which implement different types of > >> weights is the longer term plan, then things might bite... > > > > > > AFAIK there's currently no such plan. > > since numpy has taken over all the basic statistics, var, std, cov, > corrcoef, and scipy.stats dropped those, I don't see any reason to > resurrect them. > > The only question IMO is which ddof for weighted std, ... > I am right now a bit unsure about whether or not the "weights" would be "aweights" or different... R seems to not care about the scale of the weights which seems a bit odd to me for an unbiased estimator? I always assumed that we can do the statistics behind using the ddof... But even if we can figure out the right way, what I am doubting a bit is that if we add weights, their names should be clear enough to not clash with possibly different kind of (interesting) weights in other functions. > statsmodels has the basic statistics with frequency weights, but they > are largely in support of t-test and similar hypothesis tests. > > Josef > > > > > > Ralf > > > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From sturla.molden at gmail.com Thu Mar 6 20:38:27 2014 From: sturla.molden at gmail.com (Sturla Molden) Date: Fri, 7 Mar 2014 01:38:27 +0000 (UTC) Subject: [Numpy-discussion] Adding weights to cov and corrcoef (Sebastian Berg) References: <1394109640.9122.13.camel@sebastian-t440> <1394152995.2916.42.camel@sebastian-t440> Message-ID: <331375220415849077.231914sturla.molden-gmail.com@news.gmane.org> Sebastian Berg wrote: > I am right now a bit unsure about whether or not the "weights" would be > "aweights" or different... R seems to not care about the scale of the > weights which seems a bit odd to me for an unbiased estimator? I always > assumed that we can do the statistics behind using the ddof... But even > if we can figure out the right way, what I am doubting a bit is that if > we add weights, their names should be clear enough to not clash with > possibly different kind of (interesting) weights in other functions. http://en.wikipedia.org/wiki/Weighted_arithmetic_mean#Weighted_sample_covariance From sturla.molden at gmail.com Thu Mar 6 20:50:35 2014 From: sturla.molden at gmail.com (Sturla Molden) Date: Fri, 7 Mar 2014 01:50:35 +0000 (UTC) Subject: [Numpy-discussion] Adding weights to cov and corrcoef (Sebastian Berg) References: <1394109640.9122.13.camel@sebastian-t440> <990860791415844251.439708sturla.molden-gmail.com@news.gmane.org> Message-ID: <547770212415849718.853634sturla.molden-gmail.com@news.gmane.org> Sturla Molden wrote: > wrote: > >> The only question IMO is which ddof for weighted std, ... > > Something like this? > > sum_weights - (ddof/float(n))*sum_weights Please ignore. From josef.pktd at gmail.com Thu Mar 6 23:32:50 2014 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 6 Mar 2014 23:32:50 -0500 Subject: [Numpy-discussion] Adding weights to cov and corrcoef (Sebastian Berg) In-Reply-To: <331375220415849077.231914sturla.molden-gmail.com@news.gmane.org> References: <1394109640.9122.13.camel@sebastian-t440> <1394152995.2916.42.camel@sebastian-t440> <331375220415849077.231914sturla.molden-gmail.com@news.gmane.org> Message-ID: On Thu, Mar 6, 2014 at 8:38 PM, Sturla Molden wrote: > Sebastian Berg wrote: > >> I am right now a bit unsure about whether or not the "weights" would be >> "aweights" or different... R seems to not care about the scale of the >> weights which seems a bit odd to me for an unbiased estimator? I always >> assumed that we can do the statistics behind using the ddof... But even >> if we can figure out the right way, what I am doubting a bit is that if >> we add weights, their names should be clear enough to not clash with >> possibly different kind of (interesting) weights in other functions. > > http://en.wikipedia.org/wiki/Weighted_arithmetic_mean#Weighted_sample_covariance just as additional motivation (I'm not into definition of weights right now :) I was just reading a chapter on robust covariance estimation, and one of the steps in many of the procedures requires weighted covariances, and weighted variances. weights are just to reduce the influence of outlying observations. Josef > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From josef.pktd at gmail.com Fri Mar 7 00:06:29 2014 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 7 Mar 2014 00:06:29 -0500 Subject: [Numpy-discussion] Adding weights to cov and corrcoef In-Reply-To: References: <1394037947.21356.20.camel@sebastian-t440> Message-ID: On Thu, Mar 6, 2014 at 2:51 PM, Nathaniel Smith wrote: > On Wed, Mar 5, 2014 at 4:45 PM, Sebastian Berg > wrote: >> >> Hi all, >> >> in Pull Request https://github.com/numpy/numpy/pull/3864 Neol Dawe >> suggested adding new parameters to our `cov` and `corrcoef` functions to >> implement weights, which already exists for `average` (the PR still >> needs to be adapted). >> >> The idea right now would be to add a `weights` and a `frequencies` >> keyword arguments to these functions. >> >> In more detail: The situation is a bit more complex for `cov` and >> `corrcoef` than `average`, because there are different types of weights. >> The current plan would be to add two new keyword arguments: >> * weights: Uncertainty weights which causes `N` to be recalculated >> accordingly (This is R's `cov.wt` default I believe). >> * frequencies: When given, `N = sum(frequencies)` and the values >> are weighted by their frequency. > > I don't understand this description at all. One them recalculates N, > and the other sets N according to some calculation? > > Is there a standard reference on how these are supposed to be > interpreted? When you talk about per-value uncertainties, I start > imagining that we're trying to estimate a population covariance given > a set of samples each corrupted by independent measurement noise, and > then there's some natural hierarchical Bayesian model one could write > down and get an ML estimate of the latent covariance via empirical > Bayes or something. But this requires a bunch of assumptions and is > that really what we want to do? (Or maybe it collapses down into > something simpler if the measurement noise is gaussian or something?) In general, going mostly based on Stata frequency weights are just a shortcut if you have repeated observations. In my unit tests, the results is the same as using np.repeat IIRC. The total number of observation is the sum of weights. aweights and pweights are mainly like weights in WLS, reflecting the uncertainty of each observation. The number of observations is equal to the number of rows. (Stata internally rescales the weights) one explanation is that observations are measured with different noise, another that observations represent the mean of subsamples with different number of observations. there is an additional degrees of freedom correction in one of the proposed calculations modeled after other packages that I never figured out. (aside: statsmodels does not normalize the scale in WLS, in contrast to Stata, and it is now equivalent to GLS with diagonal sigma. The meaning of weight=1 depends on the user. nobs is number of rows.) no Bayesian analysis involved. but I guess someone could come up with a Bayesian interpretation. I think the two proposed weight types, weights and frequencies, should be able to handle almost all cases. Josef > > -n > > -- > Nathaniel J. Smith > Postdoctoral researcher - Informatics - University of Edinburgh > http://vorpus.org > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From cournape at gmail.com Fri Mar 7 05:49:05 2014 From: cournape at gmail.com (David Cournapeau) Date: Fri, 7 Mar 2014 10:49:05 +0000 Subject: [Numpy-discussion] numpy gsoc ideas (was: numpy gsoc topic idea: configurable algorithm precision and vector math library integration) In-Reply-To: References: Message-ID: On Thu, Mar 6, 2014 at 1:59 PM, Nathaniel Smith wrote: > On Thu, Mar 6, 2014 at 9:11 AM, David Cournapeau > wrote: > > > > On Wed, Mar 5, 2014 at 9:11 PM, Nathaniel Smith wrote: > >> So this project would have the following goals, depending on how > >> practical this turns out to be: (1) produce a hacky proof-of-concept > >> system for doing the above, (2) turn the hacky proof-of-concept into > >> something actually viable for use in real life (possibly this would > >> require getting changes upstream into Cython, etc.), (3) use this > >> system to actually port some interesting numpy code into cython. > > > > > > Having to synchronise two projects may be hard for a GSoC, no ? > > Yeah, if someone is interested in this it would be nice to get someone > from Cython involved too. But that's why the primary goal is to > produce a proof-of-concept -- even if all that comes out is that we > learn that this cannot be done in an acceptable manner, then that's > still a succesful (albeit disappointing) result. > > > Otherwise, I am a bit worried about cython being used on the current C > code > > as is, because core and python C API are so interwined (especially > > multiarray). > > I don't understand this objection. The whole advantage of Cython is > that it makes it much, much easier to write code that involves > intertwining complex algorithms and heavy use of the Python C API :-). There's tons of bug-prone spaghetti in numpy for doing boring things > like refcounting, exception passing, and argument parsing. > No argument there, doing refcounting, etc.. manually is a waste of time. Ideally, cython would be used for the boring stuff, and we keep C for the low-level machinery, but the current code don't cleanly separate those two layers (there is simple C API for indexing, ufunc, etc...). I am concerned about cython making that difference even more blurry. David > > -n > > -- > Nathaniel J. Smith > Postdoctoral researcher - Informatics - University of Edinburgh > http://vorpus.org > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Fri Mar 7 10:39:08 2014 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 7 Mar 2014 10:39:08 -0500 Subject: [Numpy-discussion] Adding weights to cov and corrcoef In-Reply-To: References: <1394037947.21356.20.camel@sebastian-t440> Message-ID: On Fri, Mar 7, 2014 at 12:06 AM, wrote: > On Thu, Mar 6, 2014 at 2:51 PM, Nathaniel Smith wrote: >> On Wed, Mar 5, 2014 at 4:45 PM, Sebastian Berg >> wrote: >>> >>> Hi all, >>> >>> in Pull Request https://github.com/numpy/numpy/pull/3864 Neol Dawe >>> suggested adding new parameters to our `cov` and `corrcoef` functions to >>> implement weights, which already exists for `average` (the PR still >>> needs to be adapted). >>> >>> The idea right now would be to add a `weights` and a `frequencies` >>> keyword arguments to these functions. >>> >>> In more detail: The situation is a bit more complex for `cov` and >>> `corrcoef` than `average`, because there are different types of weights. >>> The current plan would be to add two new keyword arguments: >>> * weights: Uncertainty weights which causes `N` to be recalculated >>> accordingly (This is R's `cov.wt` default I believe). >>> * frequencies: When given, `N = sum(frequencies)` and the values >>> are weighted by their frequency. >> >> I don't understand this description at all. One them recalculates N, >> and the other sets N according to some calculation? >> >> Is there a standard reference on how these are supposed to be >> interpreted? When you talk about per-value uncertainties, I start >> imagining that we're trying to estimate a population covariance given >> a set of samples each corrupted by independent measurement noise, and >> then there's some natural hierarchical Bayesian model one could write >> down and get an ML estimate of the latent covariance via empirical >> Bayes or something. But this requires a bunch of assumptions and is >> that really what we want to do? (Or maybe it collapses down into >> something simpler if the measurement noise is gaussian or something?) > > In general, going mostly based on Stata > > frequency weights are just a shortcut if you have repeated > observations. In my unit tests, the results is the same as using > np.repeat IIRC. The total number of observation is the sum of weights. > > aweights and pweights are mainly like weights in WLS, reflecting the > uncertainty of each observation. The number of observations is equal > to the number of rows. (Stata internally rescales the weights) > one explanation is that observations are measured with different > noise, another that observations represent the mean of subsamples with > different number of observations. > > there is an additional degrees of freedom correction in one of the > proposed calculations modeled after other packages that I never > figured out. I found the missing proof http://stats.stackexchange.com/questions/47325/bias-correction-in-weighted-variance Josef > > (aside: statsmodels does not normalize the scale in WLS, in contrast > to Stata, and it is now equivalent to GLS with diagonal sigma. The > meaning of weight=1 depends on the user. nobs is number of rows.) > > no Bayesian analysis involved. but I guess someone could come up with > a Bayesian interpretation. > > I think the two proposed weight types, weights and frequencies, should > be able to handle almost all cases. > > Josef > >> >> -n >> >> -- >> Nathaniel J. Smith >> Postdoctoral researcher - Informatics - University of Edinburgh >> http://vorpus.org >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion From pierre.haessig at crans.org Fri Mar 7 11:31:48 2014 From: pierre.haessig at crans.org (Pierre Haessig) Date: Fri, 07 Mar 2014 17:31:48 +0100 Subject: [Numpy-discussion] numpy apply_along_axis named arguments In-Reply-To: <5318594E.1020401@ic3.cat> References: <5318594E.1020401@ic3.cat> Message-ID: <5319F474.6020006@crans.org> Hi, Le 06/03/2014 12:17, Albert Jornet Puig a ?crit : > I am working with *apply_along_axis* method and I would like to apply > a method that requires to pass named arguments > (scipy.stats.mstats.mquantiles with prob[]). But currently, it is not > possible with *apply_along_axis*. > > I wonder if it would make sense to add the possibility to pass named > arguments. I am also aware that It could be implement using other ways > (a loop for each row). That's why I would like to ask whether it makes > sense or not to ask for this. I see two alternatives (of course I don't know if it would work for your particular situation) 1) a wrapper function to make extra arguments disappear, like : my_quantiles = lambda a: scipy.stats.mstats.mquantiles(a, prob=[0.25]) 2) or use directly the `axis` argument of scipy.stats.mstats.mquantiles (that's the simplest, if possible with your usecase) best, Pierre -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 880 bytes Desc: OpenPGP digital signature URL: From charlesr.harris at gmail.com Fri Mar 7 18:26:43 2014 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 7 Mar 2014 16:26:43 -0700 Subject: [Numpy-discussion] numpy gsoc ideas (was: numpy gsoc topic idea: configurable algorithm precision and vector math library integration) In-Reply-To: References: Message-ID: On Wed, Mar 5, 2014 at 2:11 PM, Nathaniel Smith wrote: > On Mon, Mar 3, 2014 at 7:20 PM, Julian Taylor > wrote: > > hi, > > > > as the numpy gsoc topic page is a little short on options I was thinking > > about adding two topics for interested students. But as I have no > > experience with gsoc or mentoring and the ideas are not very fleshed out > > yet I'd like to ask if it might make sense at all: > > > > 1. configurable algorithm precision > [...] > > with np.precmode(default="fast"): > > np.abs(complex_array) > > > > or fast everything except sum and hypot > > > > with np.precmode(default="fast", sum="kahan", hypot="standard"): > > np.sum(d) > [...] > > Not a big fan of this one -- it seems like the biggest bulk of the > effort would be in figuring out a non-horrible API for exposing these > things and getting consensus around it, which is not a good fit to the > SoC structure. > > I'm pretty nervous about the datetime proposal that's currently on the > wiki, for similar reasons -- I'm not sure it's actually doable in the > SoC context. > > > 2. vector math library integration > > This is a great suggestion -- clear scope, clear benefit. > > Two more ideas: > > 3. Using Cython in the numpy core > > The numpy core contains tons of complicated C code implementing > elaborate operations like indexing, casting, ufunc dispatch, etc. It > would be really nice if we could use Cython to write some of these > things. However, there is a practical problem: Cython assumes that > each .pyx file generates a single compiled module with its own > Cython-defined API. Numpy, however, contains a large number of .c > files which are all compiled together into a single module, with its > own home-brewed system for defining the public API. And we can't > rewrite the whole thing. So for this to be viable, we would need some > way to compile a bunch of .c *and .pyx* files together into a single > module, and allow the .c and .pyx files to call each other. This might > involve changes to Cython, some sort of clever post-processing or glue > code to get existing cython-generated source code to play nicely with > the rest of numpy, or something else. > > So this project would have the following goals, depending on how > practical this turns out to be: (1) produce a hacky proof-of-concept > system for doing the above, (2) turn the hacky proof-of-concept into > something actually viable for use in real life (possibly this would > require getting changes upstream into Cython, etc.), (3) use this > system to actually port some interesting numpy code into cython. > > If I were to rewrite some Numpy C code in Cython, I'd try _compiled_base.c first. > 4. Pythonic dtypes > > The current dtype system is klugey. It basically defines its own class > system, in parallel to Python's, and unsurprisingly, this new class > system is not as good. In particular, it has limitations around the > storage of instance-specific data which rule out a large variety of > interesting user-defined dtypes, and causes us to need some truly > nasty hacks to support the built-in dtypes we do have. And it makes > defining a new dtype much more complicated than defining a new Python > class. > > This project would be to implement a new dtype system for numpy, in > which np.dtype becomes a near-empty base class, different dtypes > (e.g., float64, float32) are simply different subclasses of np.dtype, > and dtype objects are simply instances of these classes. Further > enhancements would be to make it possible to define new dtypes in pure > Python by subclassing np.dtype and implementing special methods for > the various dtype operations, and to make it possible for ufunc loops > to see the dtype objects. > > This project would provide the key enabling piece for a wide variety > of interesting new features: missing value support, better handling of > strings and categorical data, unit handling, automatic > differentiation, and probably a bunch more I'm forgetting right now. > > If we get someone who's up to handling the dtype thing then I can > mentor or co-mentor. > > What do y'all think? > > (I don't think I have access to update that wiki page -- or maybe I'm > just not clever enough to figure out how -- so it would be helpful if > someone who can, could?) > > Another possibility would be plugin random number generators for numpy.random. That would require a student with good deal of expertise though, design takes more experience than coding. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Sat Mar 8 04:50:07 2014 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sat, 8 Mar 2014 10:50:07 +0100 Subject: [Numpy-discussion] 1.8.1rc1 on sourceforge. In-Reply-To: References: <5315CBD9.5040201@web.de> <53160894.7070207@uci.edu> <53160D81.3030804@googlemail.com> Message-ID: On Thu, Mar 6, 2014 at 8:38 PM, Matthew Brett wrote: > Hi, > > On Thu, Mar 6, 2014 at 11:21 AM, Charles R Harris > wrote: > > > On Thu, Mar 6, 2014 at 12:05 PM, Matthew Brett > > wrote: > >> > >> On Thu, Mar 6, 2014 at 9:37 AM, Charles R Harris > >> wrote: > > >> >> Julian has done most of the work for 1.8.1. I did the 1.8.0 release > >> >> because it needed doing, but building releases isn't my strong point > >> >> and > >> >> Ralf actually did the builds for that. So I'll happily send you my > ssh, > >> >> but > >> >> either Ralph or Julian might be a better bet for getting the work > done > >> >> :) > >> >> > >> > > >> > Or, I might add, yourself, if you are interested in taking over that > >> > role. > >> > >> I don't know the code well enough to be the release manager, but I'm > >> very happy to do the OSX binary builds. So - release manager VP of > >> OSX maybe? > >> > > > > That would be helpful. Ralf does those now and I suspect he would welcome > > the extra hands. > He would:) > The two sites for release builds are Sourceforge and Pypi. > > I don't know if the wheels builds are good enough/accepted on Pypi, but > if > > you would like permissions on Sourceforge we can extend them to you. We > have > > been trying to do releases for OSX 1.5, which needs a machine running an > > obsolete OS, but perhaps we should consider dropping that in the future. > > Ralf - any thoughts? > > pypi is accepting wheels: > > http://pythonwheels.com/ > https://pypi.python.org/pypi/pyzmq/14.0.1 > We tried once to put wheels on SF without much response, and if we put them on testpypi I don't expect much more. Since the wheels appear to work, let's just put them on PyPi and fix possible issues if and when they show up. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Sat Mar 8 04:56:31 2014 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sat, 8 Mar 2014 10:56:31 +0100 Subject: [Numpy-discussion] 1.8.1rc1 on sourceforge. In-Reply-To: References: <5315CBD9.5040201@web.de> <53160894.7070207@uci.edu> <53160D81.3030804@googlemail.com> Message-ID: On Thu, Mar 6, 2014 at 10:10 PM, Charles R Harris wrote: > > > > On Thu, Mar 6, 2014 at 1:32 PM, Chris Barker wrote: > >> On Thu, Mar 6, 2014 at 11:21 AM, Charles R Harris < >> charlesr.harris at gmail.com> wrote: >> >>> That would be helpful. Ralf does those now and I suspect he would >>> welcome the extra hands. The two sites for release builds are Sourceforge >>> and Pypi. I don't know if the wheels builds are good enough/accepted on >>> Pypi, >>> >> >> Would anyone decide that other than this group? >> > No, that's up to us. > >> >>> but if you would like permissions on Sourceforge we can extend them to >>> you. We have been trying to do releases for OSX 1.5, which needs a machine >>> running an obsolete OS, but perhaps we should consider dropping that in the >>> future. >>> >> >> Drop that baby! >> > +1 > >> First, it's bit odd -- as I undertand it, the python.org builds support >> either 10.3.9 + or 10.6+. As 10.5 has not been supported for Apple for a >> couple years, and 10.6 is getting pretty darn long in the tooth, the only >> reason to support that older build is for PPC support - I wonder how many >> folks are still running PPCs? I thought I was one of the hold outs, and I >> dropped it over a year ago. I'd love to know if it is something that the >> community still needs to support. >> >> > Now that I look on sourceforge, I don't see any OS X 10.5 builds, they are > all 10.6+. So that bit of support seems to have dropped in reality, if not > officially. > For RCs I usually don't bother, because the setup is a bit awkward (the 10.5 machine sits in Vincent's basement and he needs to start up for me) and there are too few testers for those to get useful feedback anyway. For 1.8.0 I planned to get back to that after uploading the 10.6 ones, but forgot. Because no one noticed until now, it looks like spending effort on 10.5 support isn't all that useful. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Sat Mar 8 05:22:08 2014 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sat, 8 Mar 2014 11:22:08 +0100 Subject: [Numpy-discussion] 1.8.1rc1 on sourceforge. In-Reply-To: References: <5315CBD9.5040201@web.de> <53160894.7070207@uci.edu> <53160D81.3030804@googlemail.com> Message-ID: On Thu, Mar 6, 2014 at 11:10 PM, Chris Barker wrote: > On Thu, Mar 6, 2014 at 1:14 PM, Matthew Brett wrote: > >> I believe that the wheels built against python.org python will in any >> case work with system python. >> > > IIUC, the system python is built agains an up-to-date SDK. so it wouldn't > run on an older OS version -- and why would anyone want it to -- it comes > with the system. > > Our wheels are built with the 10.6 SDK -- OS-X is supposed to be backward > compatible, so 10.6 SDK code will run on newer OS versions -- but could > there be any clashes if a shared lib is linked in that uses a different SDK > than the host application? I have no idea if this is expected to be robust > -- though the linker doesn't given an error -- so maybe. > I just commented on this issue on the pip PR - our dmg installers are built *on 10.6*, not with the 10.6 SDK. Those two are not equivalent, and the last time I tried using an SDK we had users reporting crashes with the binaries on SF. > > interesting -- the "intel" part of that means is SHOULD be a universal > binary with 32 and 64 bit in there. And on my 10.7 system, it is. So maybe > this should work. > > In fact, if I rename my wheel from >> `numpy-1.8.1rc1-cp27-none-macosx_10_6_intel.whl` to >> `numpy-1.8.1rc1-cp27-none-macosx_10_6_intel.macosx_10_7_intel.whl`, >> system python will pick up this wheel, but obviously this could get >> boring for lots of OSX versions, and in any case, it's not really our >> target market for the wheels. [4] >> > > Exactly, though if we can support it easily, maybe good to do. > > Min RK actually has a pull request in to relax this OSX version >> specificity [3] because the wheels should be (and seem to be) >> interoperable, but the take-home is that we're not likely to run into >> trouble with system python. >> > > If we relax things too much, will we also get homebrew and macports and > built-it-myself pythons, and will they work? > Not likely. On 10.7 and 10.8 python.org and system python use llvm-gcc while IIRC homebrew used clang. There's probably more differences. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Sat Mar 8 16:06:13 2014 From: matthew.brett at gmail.com (Matthew Brett) Date: Sat, 8 Mar 2014 13:06:13 -0800 Subject: [Numpy-discussion] 1.8.1rc1 on sourceforge. In-Reply-To: References: <5315CBD9.5040201@web.de> <53160894.7070207@uci.edu> <53160D81.3030804@googlemail.com> Message-ID: Hi, Thanks to Chuck and Jarrod for giving me upload permission - wheels are on sourceforge now: https://sourceforge.net/projects/numpy/files/NumPy/1.8.1rc1 Until the wheels reach pypi, you'll have to test by: * downloading the python 2.7 or 3.3 wheel to a directory (say the current directory) and: * upgrading pip to latest (1.5.4): http://pip.readthedocs.org/en/latest/installing.html * pip install --pre --find-links . numpy The wheels match the python.org python. You can make them install on system python at your own risk (likely small) with `pip install numpy-1.8.1rc1-cp27-none-macosx_10_6_intel.whl` (hoping you do in fact have python 2.7 as your system python). Feedback very welcome, especially from anyone running homebrew python trying the direct pip install above. Cheers, Matthew From piem at piem.org Sat Mar 8 16:54:44 2014 From: piem at piem.org (Paul Brossier) Date: Sat, 08 Mar 2014 18:54:44 -0300 Subject: [Numpy-discussion] c api deprecations with NPY_NO_DEPRECATED_API Message-ID: <531B91A4.8080506@piem.org> hi all, I'm trying to understand how to use the deprecation mechanism. 1. When not defining NPY_NO_DEPRECATED_API, I get warnings (as expected): #warning "Using deprecated NumPy API, disable it by #defining NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION" 2. When defining NPY_NO_DEPRECATED_API, as mentioned in the above warning and in the documentation, I get this error: #error Should never include npy_deprecated_api directly. 3. If instead I include , I get no warning at all, and the extension builds as expected. Now, why does the second error occur, whereas I do not include npy_deprecated_api.h, and most important, what is the correct way to use the deprecation mechanism? There seems to be a few issues related to this on github, most of them closed, but reading them did not help me understanding. The extension i'm trying to improve is aubio and can be found at http://aubio.org. A copy of the relevant code is at: https://github.com/piem/aubio/blob/develop/python/ext/aubio-types.h thanks, piem From charlesr.harris at gmail.com Sat Mar 8 17:25:09 2014 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 8 Mar 2014 15:25:09 -0700 Subject: [Numpy-discussion] c api deprecations with NPY_NO_DEPRECATED_API In-Reply-To: <531B91A4.8080506@piem.org> References: <531B91A4.8080506@piem.org> Message-ID: On Sat, Mar 8, 2014 at 2:54 PM, Paul Brossier wrote: > hi all, > > I'm trying to understand how to use the deprecation mechanism. > > 1. When not defining NPY_NO_DEPRECATED_API, I get warnings (as expected): > > #warning "Using deprecated NumPy API, disable it by #defining > NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION" > > 2. When defining NPY_NO_DEPRECATED_API, as mentioned in the above > warning and in the documentation, I get this error: > > #error Should never include npy_deprecated_api directly. > > 3. If instead I include , I get no > warning at all, and the extension builds as expected. > > Now, why does the second error occur, whereas I do not include > npy_deprecated_api.h, and most important, what is the correct way to use > the deprecation mechanism? > > There seems to be a few issues related to this on github, most of them > closed, but reading them did not help me understanding. > > The extension i'm trying to improve is aubio and can be found at > http://aubio.org. A copy of the relevant code is at: > > https://github.com/piem/aubio/blob/develop/python/ext/aubio-types.h > > You should include the same files whether or not NPY_NO_DEPRECATED_API is defined. Usually numpy/arrayobject.h is the only needed include file. For instance, fftpack_litemodule.c has at the top #define NPY_NO_DEPRECATED_API NPY_API_VERSION #include "fftpack.h" #include "Python.h" #include "numpy/arrayobject.h" Where in this case NPY_API_VERSION is the current version, NPY_1_7_API_VERSION would be the numpy1.7.x API, etc. You use the version you intend to support and things deprecated after that version will still be available to you. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From projetmbc at gmail.com Mon Mar 10 05:38:56 2014 From: projetmbc at gmail.com (Christophe Bal) Date: Mon, 10 Mar 2014 10:38:56 +0100 Subject: [Numpy-discussion] SVG Logo of NumPy Message-ID: Hello, is there a SVG version of the NumPy logo ? This would be to be used on my website. Christophe BAL -------------- next part -------------- An HTML attachment was scrubbed... URL: From pierre.haessig at crans.org Mon Mar 10 05:46:52 2014 From: pierre.haessig at crans.org (Pierre Haessig) Date: Mon, 10 Mar 2014 10:46:52 +0100 Subject: [Numpy-discussion] SVG Logo of NumPy In-Reply-To: References: Message-ID: <531D8A0C.4050903@crans.org> Le 10/03/2014 10:38, Christophe Bal a ?crit : > is there a SVG version of the NumPy logo ? This would be to be used on > my website. Could it be one of those https://github.com/numpy/numpy/tree/master/branding/icons ? (don't know if it's up to date though) best, Pierre From projetmbc at gmail.com Mon Mar 10 05:55:19 2014 From: projetmbc at gmail.com (Christophe Bal) Date: Mon, 10 Mar 2014 10:55:19 +0100 Subject: [Numpy-discussion] SVG Logo of NumPy In-Reply-To: <531D8A0C.4050903@crans.org> References: <531D8A0C.4050903@crans.org> Message-ID: Sorry for my no-brain question. ;-) Thanks for the link. 2014-03-10 10:46 GMT+01:00 Pierre Haessig : > Le 10/03/2014 10:38, Christophe Bal a ?crit : > > is there a SVG version of the NumPy logo ? This would be to be used on > > my website. > Could it be one of those > https://github.com/numpy/numpy/tree/master/branding/icons ? > (don't know if it's up to date though) > > best, > Pierre > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Mon Mar 10 12:39:21 2014 From: chris.barker at noaa.gov (Chris Barker) Date: Mon, 10 Mar 2014 09:39:21 -0700 Subject: [Numpy-discussion] 1.8.1rc1 on sourceforge. In-Reply-To: References: <5315CBD9.5040201@web.de> <53160894.7070207@uci.edu> <53160D81.3030804@googlemail.com> Message-ID: On Sat, Mar 8, 2014 at 2:22 AM, Ralf Gommers wrote: > If we relax things too much, will we also get homebrew and macports and > built-it-myself pythons, and will they work? > > Not likely. On 10.7 and 10.8 python.org and system python use llvm-gcc > while IIRC homebrew used clang. There's probably more differences. > So the question is: If someone is running a brew python and does "pip install numpy" will pip find a binary wheel that will then not work? That would be bad, but maybe not our problem -- brew users should be using brew to build numpy anyway. It would be nicer if it didn't happen, though. -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From jenshnielsen at gmail.com Mon Mar 10 13:27:46 2014 From: jenshnielsen at gmail.com (Jens Nielsen) Date: Mon, 10 Mar 2014 17:27:46 +0000 Subject: [Numpy-discussion] 1.8.1rc1 on sourceforge. In-Reply-To: References: <5315CBD9.5040201@web.de> <53160894.7070207@uci.edu> <53160D81.3030804@googlemail.com> Message-ID: On Mon, Mar 10, 2014 at 4:39 PM, Chris Barker wrote: > On Sat, Mar 8, 2014 at 2:22 AM, Ralf Gommers wrote: > >> If we relax things too much, will we also get homebrew and macports and >> built-it-myself pythons, and will they work? >> >> Not likely. On 10.7 and 10.8 python.org and system python use llvm-gcc >> while IIRC homebrew used clang. There's probably more differences. >> > > So the question is: > > If someone is running a brew python and does "pip install numpy" will pip > find a binary wheel that will then not work? That would be bad, but maybe > not our problem -- brew users should be using brew to build numpy anyway. > No that is not how homebrew works. Brew does not pack anything that pip installable: https://github.com/Homebrew/homebrew/wiki/Homebrew-and-Python so that would be a big issue. Especially since installing numpy with pip and brew python just works without any problem when building from source. There are unofficial brew taps (channels) with python packages but this not the recommended way. /Jens > It would be nicer if it didn't happen, though. > > -CHB > > -- > > Christopher Barker, Ph.D. > Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R (206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > Chris.Barker at noaa.gov > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Mon Mar 10 14:07:53 2014 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Mon, 10 Mar 2014 19:07:53 +0100 Subject: [Numpy-discussion] 1.8.1rc1 on sourceforge. In-Reply-To: References: <5315CBD9.5040201@web.de> <53160894.7070207@uci.edu> <53160D81.3030804@googlemail.com> Message-ID: On Mon, Mar 10, 2014 at 6:27 PM, Jens Nielsen wrote: > > > > On Mon, Mar 10, 2014 at 4:39 PM, Chris Barker wrote: > >> On Sat, Mar 8, 2014 at 2:22 AM, Ralf Gommers wrote: >> >>> If we relax things too much, will we also get homebrew and macports and >>> built-it-myself pythons, and will they work? >>> >>> Not likely. On 10.7 and 10.8 python.org and system python use llvm-gcc >>> while IIRC homebrew used clang. There's probably more differences. >>> >> >> So the question is: >> >> If someone is running a brew python and does "pip install numpy" will >> pip find a binary wheel that will then not work? That would be bad, but >> maybe not our problem -- brew users should be using brew to build numpy >> anyway. >> > "pip install numpy" will build from source, and if "pip install numpy --use-wheel" doesn't work I guess we should fix it by documentation that in all the obvious places. Ralf > > No that is not how homebrew works. Brew does not pack anything that pip > installable: https://github.com/Homebrew/homebrew/wiki/Homebrew-and-Python > so that would be a big issue. Especially since installing numpy with pip > and brew python just works without any problem when building from source. > > There are unofficial brew taps (channels) with python packages but this > not the recommended way. > > /Jens > >> It would be nicer if it didn't happen, though. >> >> -CHB >> >> -- >> >> Christopher Barker, Ph.D. >> Oceanographer >> >> Emergency Response Division >> NOAA/NOS/OR&R (206) 526-6959 voice >> 7600 Sand Point Way NE (206) 526-6329 fax >> Seattle, WA 98115 (206) 526-6317 main reception >> >> Chris.Barker at noaa.gov >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Mon Mar 10 14:09:12 2014 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Mon, 10 Mar 2014 19:09:12 +0100 Subject: [Numpy-discussion] 1.8.1rc1 on sourceforge. In-Reply-To: References: <5315CBD9.5040201@web.de> <53160894.7070207@uci.edu> <53160D81.3030804@googlemail.com> Message-ID: On Sat, Mar 8, 2014 at 10:06 PM, Matthew Brett wrote: > Hi, > > Thanks to Chuck and Jarrod for giving me upload permission - wheels > are on sourceforge now: > > https://sourceforge.net/projects/numpy/files/NumPy/1.8.1rc1 > Nice! > Until the wheels reach pypi, you'll have to test by: > If you send me your pypi username I'll give you the right permissions there also. Ralf > * downloading the python 2.7 or 3.3 wheel to a directory (say the > current directory) and: > * upgrading pip to latest (1.5.4): > http://pip.readthedocs.org/en/latest/installing.html > * pip install --pre --find-links . numpy > > The wheels match the python.org python. You can make them install on > system python at your own risk (likely small) with `pip install > numpy-1.8.1rc1-cp27-none-macosx_10_6_intel.whl` (hoping you do in fact > have python 2.7 as your system python). > > Feedback very welcome, especially from anyone running homebrew python > trying the direct pip install above. > > Cheers, > > Matthew > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bryanv at continuum.io Mon Mar 10 14:17:13 2014 From: bryanv at continuum.io (Bryan Van de Ven) Date: Mon, 10 Mar 2014 14:17:13 -0400 Subject: [Numpy-discussion] ANN: Bokeh 0.4.2 Message-ID: <1F60C78D-44CA-4803-B305-E40E4B1884A8@continuum.io> I am happy to announce the release of Bokeh version 0.4.2! Bokeh is a Python library for visualizing large and realtime datasets on the web. Its goal is to provide elegant, concise construction of novel graphics in the style of Protovis/D3, while delivering high-performance interactivity to thin clients. Bokeh includes its own Javascript library (BokehJS) that implements a reactive scenegraph representation of the plot, and renders efficiently to HTML5 Canvas. Bokeh works well with IPython Notebook, but can generate standalone graphics that embed into regular HTML. Check out the full documentation, interactive gallery, and tutorial at http://bokeh.pydata.org If you are using Anaconda, you can install with conda: conda install bokeh Alternatively, you can install with pip: pip install bokeh Some of the new features in this release include: * Additional Matplotlib and Seaborn compatibility (PolyCollection) * Extensive tutorial with exercises and solutions added to docs * new %bokeh magic for improved IPython notebook integration * Windows support for bokeh-server with two new storage backends (in-memory and shelve) Also, we've fixed lots of little bugs - see the CHANGELOG for full details. BokehJS is also available by CDN for use in standalone javascript applications: http://cdn.pydata.org/bokeh-0.4.2.js http://cdn.pydata.org/bokeh-0.4.2.css http://cdn.pydata.org/bokeh-0.4.2.min.js http://cdn.pydata.org/bokeh-0.4.2.min.css Some examples of BokehJS use can be found on the Bokeh JSFiddle page: http://jsfiddle.net/user/bokeh/fiddles/ The release of Bokeh 0.5 is planned for late March. Some notable features we plan to include are: * Abstract Rendering for semantically meaningful downsampling of large datasets * Better grid-based layout system, using Cassowary.js * More MPL/Seaborn/ggplot.py compatibility and examples * Additional tools, improved interactions, and better plot frame * Touch support Issues, enhancement requests, and pull requests can be made on the Bokeh Github page: https://github.com/continuumio/bokeh Questions can be directed to the Bokeh mailing list: bokeh at continuum.io Special thanks to recent contributors: Melissa Gymrek, Amy Troschinetz, Ben Zaitlen, Damian Avila, and Terry Jones Regards, Bryan Van de Ven Continuum Analytics http://continuum.io From chris.barker at noaa.gov Mon Mar 10 15:37:17 2014 From: chris.barker at noaa.gov (Chris Barker) Date: Mon, 10 Mar 2014 12:37:17 -0700 Subject: [Numpy-discussion] 1.8.1rc1 on sourceforge. In-Reply-To: References: <5315CBD9.5040201@web.de> <53160894.7070207@uci.edu> <53160D81.3030804@googlemail.com> Message-ID: On Mon, Mar 10, 2014 at 10:27 AM, Jens Nielsen wrote: > If someone is running a brew python and does "pip install numpy" will pip > find a binary wheel that will then not work? That would be bad, but maybe > not our problem -- brew users should be using brew to build numpy anyway. > > No that is not how homebrew works. Brew does not pack anything that pip > installable: https://github.com/Homebrew/homebrew/wiki/Homebrew-and-Python > so that would be a big issue. Especially since installing numpy with pip > and brew python just works without any problem when building from source. > > There are unofficial brew taps (channels) with python packages but this > not the recommended way. > OK, then I'll re-phrase: Brew users should be building from source anyway, even if they use pip to do it. But it would be bad if they had to go out of their way to do that -- I've lost track of pip's default behavior in this regard. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Mon Mar 10 17:45:44 2014 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Mon, 10 Mar 2014 22:45:44 +0100 Subject: [Numpy-discussion] GSoC application template available Message-ID: Hi GSoC students, The PSF just made their application template for this year available: https://wiki.python.org/moin/SummerOfCode/ApplicationTemplate2014. There are a few things in there that are required (for one, submit a patch to numpy or scipy if you haven't done so yet), and some good recommendations. Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Mon Mar 10 19:21:32 2014 From: matthew.brett at gmail.com (Matthew Brett) Date: Mon, 10 Mar 2014 16:21:32 -0700 Subject: [Numpy-discussion] 1.8.1rc1 on sourceforge. In-Reply-To: References: <5315CBD9.5040201@web.de> <53160894.7070207@uci.edu> <53160D81.3030804@googlemail.com> Message-ID: Hi, On Mon, Mar 10, 2014 at 12:37 PM, Chris Barker wrote: > On Mon, Mar 10, 2014 at 10:27 AM, Jens Nielsen > wrote: >> >> If someone is running a brew python and does "pip install numpy" will pip >> find a binary wheel that will then not work? That would be bad, but maybe >> not our problem -- brew users should be using brew to build numpy anyway. >> >> No that is not how homebrew works. Brew does not pack anything that pip >> installable: https://github.com/Homebrew/homebrew/wiki/Homebrew-and-Python >> so that would be a big issue. Especially since installing numpy with pip >> and brew python just works without any problem when building from source. >> >> There are unofficial brew taps (channels) with python packages but this >> not the recommended way. > > > OK, then I'll re-phrase: > > Brew users should be building from source anyway, even if they use pip to do > it. > > But it would be bad if they had to go out of their way to do that -- I've > lost track of pip's default behavior in this regard. We should be fine for homebrew. I believe: 1) The wheels work for homebrew as well 2) Homebrew won't pick them up by default at the moment Homebrew python has a more specific architecture version then the python.org python or system python: $ /usr/local/Cellar/python/2.7.6/bin/python -c "import distutils.util; print(distutils.util.get_platform())" macosx-10.9-x86_64 So - for now - pip won't recognize the 10.6_intel wheel as matching the required 10.9_x86_64 suffix on the wheel (can be solved by renaming the file as further up the thread). But - in any case - the wheels work fine in homebrew: mkvirtualenv homebrew-2.7 --python=/usr/local/Cellar/python/2.7.6/bin/python pip install nose # this doesn't see the wheel - because of the platform tag # starts to download and install from source pip install --pre --find-links . numpy # this works pip install numpy-1.8.1rc1-cp27-none-macosx_10_6_intel.whl # tests all pass python -c 'import numpy; numpy.test()' # We're picking up the right numpy python -c 'import numpy; print(numpy.__file__)' deactivate # same for 3.3 mkvirtualenv homebrew-3.3 --python=/usr/local/Cellar/python3/3.3.5/bin/python3 pip install nose python -c 'import numpy' pip install --pre --find-links . numpy pip install numpy-1.8.1rc1-cp33-cp33m-macosx_10_6_intel.whl python -c 'import numpy; numpy.test()' So I think we're good. We should surely set up automated testing to assure ourselves that this is still true, when there's time, it shouldn't take very long. Cheers, Matthew From matthew.brett at gmail.com Tue Mar 11 04:26:48 2014 From: matthew.brett at gmail.com (Matthew Brett) Date: Tue, 11 Mar 2014 01:26:48 -0700 Subject: [Numpy-discussion] 1.8.1rc1 on sourceforge. In-Reply-To: References: <5315CBD9.5040201@web.de> <53160894.7070207@uci.edu> <53160D81.3030804@googlemail.com> Message-ID: Hi, On Mon, Mar 10, 2014 at 11:09 AM, Ralf Gommers wrote: > > > > On Sat, Mar 8, 2014 at 10:06 PM, Matthew Brett > wrote: >> >> Hi, >> >> Thanks to Chuck and Jarrod for giving me upload permission - wheels >> are on sourceforge now: >> >> https://sourceforge.net/projects/numpy/files/NumPy/1.8.1rc1 > > > Nice! > >> >> Until the wheels reach pypi, you'll have to test by: > > > If you send me your pypi username I'll give you the right permissions there > also. Thanks - that would be great - 'matthew.brett' on pypi. Cheers, Matthew From piem at piem.org Tue Mar 11 09:27:45 2014 From: piem at piem.org (Paul Brossier) Date: Tue, 11 Mar 2014 10:27:45 -0300 Subject: [Numpy-discussion] c api deprecations with NPY_NO_DEPRECATED_API In-Reply-To: References: <531B91A4.8080506@piem.org> Message-ID: <531F0F51.6040503@piem.org> On 08/03/2014 19:25, Charles R Harris wrote: > Thanks for your quick reply Charles. > On Sat, Mar 8, 2014 at 2:54 PM, Paul Brossier > wrote: > > > > 2. When defining NPY_NO_DEPRECATED_API, as mentioned in the above > > warning and in the documentation, I get this error: > > > > #error Should never include npy_deprecated_api directly. ok, this error was triggered by some older version of numpy, 1.8.0.dev-4600b2f-20130131. updating to 1.8.0 fixed it. sorry for the noise! > > The extension i'm trying to improve is aubio and can be found at > > http://aubio.org. A copy of the relevant code is at: > > > > https://github.com/piem/aubio/blob/develop/python/ext/aubio-types.h > > > > You should include the same files whether or not NPY_NO_DEPRECATED_API > is defined. Usually numpy/arrayobject.h is the only needed include > file. For instance, fftpack_litemodule.c has at the top > > #define NPY_NO_DEPRECATED_API NPY_API_VERSION > > #include "fftpack.h" > #include "Python.h" > #include "numpy/arrayobject.h" Yes, that's pretty much what I do in aubio. > Where in this case NPY_API_VERSION is the current version, > NPY_1_7_API_VERSION would be the numpy1.7.x API, etc. You use the > version you intend to support and things deprecated after that version > will still be available to you. If I understand correctly, the current version is the one installed on the user system. So using NPY_API_VERSION would mean "this code should work with any version of numpy". I guess this is what I want (I would even expect this to be the default setting). Did I miss something? Thanks, Paul From njs at pobox.com Tue Mar 11 09:49:08 2014 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 11 Mar 2014 13:49:08 +0000 Subject: [Numpy-discussion] c api deprecations with NPY_NO_DEPRECATED_API In-Reply-To: <531F0F51.6040503@piem.org> References: <531B91A4.8080506@piem.org> <531F0F51.6040503@piem.org> Message-ID: On 11 Mar 2014 13:28, "Paul Brossier" wrote: > If I understand correctly, the current version is the one installed on > the user system. So using NPY_API_VERSION would mean "this code should > work with any version of numpy". I guess this is what I want (I would > even expect this to be the default setting). Did I miss something? Using NPY_API_VERSION here means "this code will work with any version of numpy, *including ones that aren't released yet and might have arbitrary API changes*". This is almost certainly not what you want. The idea of the deprecation support is that it gives you a grace period to adapt to upcoming changes before they break your code. Suppose PyArray_foo is going to be removed in numpy 1.10. If we just removed it, your first warning would be when we release 1.10 and suddenly you have angry users who find your software no longer works. So the trick is that before we remove it entirely, we release 1.9, in which PyArray_foo is available if your NPY_DEPRECATED_API version is set to 1.8 or earlier, but not if it's set to 1.9. Your released versions thus continue to work, your users are happy, and the first person to encounter the problem is you, when you try to update your NPY_DEPRECATED_API to 1.9. You fix the problem, you make a new release, and then when 1.10 comes along everything works. Moral: set NPY_DEPRECATED_API to match the highest numpy version you've tested. -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From piem at piem.org Tue Mar 11 10:25:04 2014 From: piem at piem.org (Paul Brossier) Date: Tue, 11 Mar 2014 11:25:04 -0300 Subject: [Numpy-discussion] c api deprecations with NPY_NO_DEPRECATED_API In-Reply-To: References: <531B91A4.8080506@piem.org> <531F0F51.6040503@piem.org> Message-ID: <531F1CC0.5040705@piem.org> On 11/03/2014 10:49, Nathaniel Smith wrote: > On 11 Mar 2014 13:28, "Paul Brossier" > wrote: >> If I understand correctly, the current version is the one installed on >> the user system. So using NPY_API_VERSION would mean "this code should >> work with any version of numpy". I guess this is what I want (I would >> even expect this to be the default setting). Did I miss something? > > Using NPY_API_VERSION here means "this code will work with any version > of numpy, *including ones that aren't released yet and might have > arbitrary API changes*". > > This is almost certainly not what you want. Thanks for the clarification. > The idea of the deprecation support is that it gives you a grace period > to adapt to upcoming changes before they break your code. Suppose > PyArray_foo is going to be removed in numpy 1.10. If we just removed it, > your first warning would be when we release 1.10 and suddenly you have > angry users who find your software no longer works. So the trick is that > before we remove it entirely, we release 1.9, in which PyArray_foo is > available if your NPY_DEPRECATED_API version is set to 1.8 or earlier, > but not if it's set to 1.9. Your released versions thus continue to > work, your users are happy, and the first person to encounter the > problem is you, when you try to update your NPY_DEPRECATED_API to 1.9. > You fix the problem, you make a new release, and then when 1.10 comes > along everything works. > > Moral: set NPY_DEPRECATED_API to match the highest numpy version you've > tested. I guess you meant NPY_NO_DEPRECATED_API? Paul From njs at pobox.com Tue Mar 11 10:58:27 2014 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 11 Mar 2014 14:58:27 +0000 Subject: [Numpy-discussion] c api deprecations with NPY_NO_DEPRECATED_API In-Reply-To: <531F1CC0.5040705@piem.org> References: <531B91A4.8080506@piem.org> <531F0F51.6040503@piem.org> <531F1CC0.5040705@piem.org> Message-ID: On 11 Mar 2014 14:25, "Paul Brossier" wrote: > > On 11/03/2014 10:49, Nathaniel Smith wrote: > > On 11 Mar 2014 13:28, "Paul Brossier" > > wrote: > >> If I understand correctly, the current version is the one installed on > >> the user system. So using NPY_API_VERSION would mean "this code should > >> work with any version of numpy". I guess this is what I want (I would > >> even expect this to be the default setting). Did I miss something? > > > > Using NPY_API_VERSION here means "this code will work with any version > > of numpy, *including ones that aren't released yet and might have > > arbitrary API changes*". > > > > This is almost certainly not what you want. > > Thanks for the clarification. > > > The idea of the deprecation support is that it gives you a grace period > > to adapt to upcoming changes before they break your code. Suppose > > PyArray_foo is going to be removed in numpy 1.10. If we just removed it, > > your first warning would be when we release 1.10 and suddenly you have > > angry users who find your software no longer works. So the trick is that > > before we remove it entirely, we release 1.9, in which PyArray_foo is > > available if your NPY_DEPRECATED_API version is set to 1.8 or earlier, > > but not if it's set to 1.9. Your released versions thus continue to > > work, your users are happy, and the first person to encounter the > > problem is you, when you try to update your NPY_DEPRECATED_API to 1.9. > > You fix the problem, you make a new release, and then when 1.10 comes > > along everything works. > > > > Moral: set NPY_DEPRECATED_API to match the highest numpy version you've > > tested. > > I guess you meant NPY_NO_DEPRECATED_API? Yes. I'm just too lazy to check these things on my phone :-). -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Tue Mar 11 15:22:35 2014 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Tue, 11 Mar 2014 20:22:35 +0100 Subject: [Numpy-discussion] 2014 John Hunter Fellowship - Call for Applications Message-ID: Hi all, I'm excited to announce, on behalf of the Numfocus board, that applications for the 2014 John Hunter Technology Fellowship are now being accepted. This is the first fellowship Numfocus is able to offer, which we see as a significant milestone. The John Hunter Technology Fellowship aims to bridge the gap between academia and real-world, open-source scientific computing projects by providing a capstone experience for individuals coming from a scientific, engineering or mathematics background. The program consists of a 6 month project-based training program for postdoctoral scientists or senior graduate students. Fellows work on scientific computing open source projects under the guidance of mentors who are leading scientists and software engineers. The aim of the Fellowship is to enable Fellows to develop the skills needed to contribute to cutting-edge open source software projects while at the same time advancing or supporting the research program they and their mentor are involved in. While proposals in any area of science and engineering are welcome, the following areas are encouraged in particular: - Accessible and reproducible computing - Enabling technology for open access publishing - Infrastructural technology supporting open-source scientific software stacks - Core open-source projects promoted by NumFOCUS Eligible applicants are postdoctoral scientists or senior PhD students, or have equivalent experience in physics, mathematics, engineering, statistics, or a related science. The program is open to applicants from any nationality and can be performed at any university or institute world-wide (US export laws permitting). All applications are due May 15, 2014 by 11:59 p.m. Central Standard Time. For more details on the program see: http://numfocus.org/john_hunter_fellowship_2014.html (this call) http://numfocus.org/fellowships.html (program) And for some background see this blog post: http://numfocus.org/announcing-the-numfocus-technology-fellowship-program.html We're looking forward to receiving your applications! Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Tue Mar 11 17:18:35 2014 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Tue, 11 Mar 2014 22:18:35 +0100 Subject: [Numpy-discussion] 1.8.1rc1 on sourceforge. In-Reply-To: References: <5315CBD9.5040201@web.de> <53160894.7070207@uci.edu> <53160D81.3030804@googlemail.com> Message-ID: On Tue, Mar 11, 2014 at 9:26 AM, Matthew Brett wrote: > Hi, > > On Mon, Mar 10, 2014 at 11:09 AM, Ralf Gommers > wrote: > > > > > > > > On Sat, Mar 8, 2014 at 10:06 PM, Matthew Brett > > wrote: > >> > >> Hi, > >> > >> Thanks to Chuck and Jarrod for giving me upload permission - wheels > >> are on sourceforge now: > >> > >> https://sourceforge.net/projects/numpy/files/NumPy/1.8.1rc1 > > > > > > Nice! > > > >> > >> Until the wheels reach pypi, you'll have to test by: > > > > > > If you send me your pypi username I'll give you the right permissions > there > > also. > > Thanks - that would be great - 'matthew.brett' on pypi. > You're a numpy admin now. Thanks for picking this up, will be quite useful in the long run! Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From rhattersley at gmail.com Wed Mar 12 10:18:12 2014 From: rhattersley at gmail.com (R Hattersley) Date: Wed, 12 Mar 2014 14:18:12 +0000 Subject: [Numpy-discussion] ANN: Biggus 0.5 Message-ID: I'm pleased to announce the release of Biggus version 0.5.0. Biggus is a pure Python library for handling virtual n-dimensional arrays of arbitrary size, and providing lazy/deferred evaluation of arithmetic and statistical operations. Biggus works with your n-dimensional array data in whatever form it currently resides - no data conversion is necessary. The documentation can be found at: http://biggus.readthedocs.org/ And it can be installed with pip via: pip install biggus The main feature of this release is a new multi-threaded evaluation engine which provides the ability to compute statistics efficiently across any axis. Future plans include: - Expand the range of arithmetic and statistical operators. - Extend the evaluation API to allow multiple simultaneous file-based outputs. - An API to simplify the creation of user-defined operators. - Multiply-concurrent execution of CPU intensive operators. - Low-level evaluation optimisation, e.g numexpr/numba. Constructive feedback is very welcome! - Raise an issue on GitHub, https://github.com/SciTools/biggus - Post on the discussion group: https://groups.google.com/forum/#!forum/scitools-biggus -------------- next part -------------- An HTML attachment was scrubbed... URL: From lmao20001 at gmail.com Wed Mar 12 12:52:18 2014 From: lmao20001 at gmail.com (Leo Mao) Date: Thu, 13 Mar 2014 00:52:18 +0800 Subject: [Numpy-discussion] GSoC project: draft of proposal Message-ID: Hi, The attachment is my draft of proposal. The project is "vector math library integration". I think I need some feedback to make it solider. Any comment will be appreciated. Thanks in advance. Regards, Leo Mao -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- An HTML attachment was scrubbed... URL: From aron at ahmadia.net Wed Mar 12 12:54:56 2014 From: aron at ahmadia.net (Aron Ahmadia) Date: Wed, 12 Mar 2014 12:54:56 -0400 Subject: [Numpy-discussion] GSoC project: draft of proposal In-Reply-To: References: Message-ID: Hi Leo, Out of curiosity, which vector math libraries did you have in mind as likely candidates for inclusion? How are you planning on selecting the library to integrate? Cheers, Aron On Wed, Mar 12, 2014 at 12:52 PM, Leo Mao wrote: > Hi, > The attachment is my draft of proposal. The project is "vector math > library integration". > I think I need some feedback to make it solider. > Any comment will be appreciated. > Thanks in advance. > > Regards, > Leo Mao > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From lmao20001 at gmail.com Wed Mar 12 13:12:51 2014 From: lmao20001 at gmail.com (Leo Mao) Date: Thu, 13 Mar 2014 01:12:51 +0800 Subject: [Numpy-discussion] GSoC project: draft of proposal In-Reply-To: References: Message-ID: Hi Aron, Previously mentioned by Julian, Yeppp may be a good candidate. As for selecting a good library, I will consider the performance and the API of the library. The integration of the library should improve the performance of numpy and also not make the source too complicated to maintain. And I think the library should be mature so that the API will not be changed significantly. Please point out if there is something I miss. Also I will be grateful to any suggestions for my proposal. Regards, Leo Mao On Thu, Mar 13, 2014 at 12:54 AM, Aron Ahmadia wrote: > Hi Leo, > > Out of curiosity, which vector math libraries did you have in mind as > likely candidates for inclusion? How are you planning on selecting the > library to integrate? > > Cheers, > Aron > > > On Wed, Mar 12, 2014 at 12:52 PM, Leo Mao wrote: > >> Hi, >> The attachment is my draft of proposal. The project is "vector math >> library integration". >> I think I need some feedback to make it solider. >> Any comment will be appreciated. >> Thanks in advance. >> >> Regards, >> Leo Mao >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Wed Mar 12 15:09:20 2014 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 12 Mar 2014 13:09:20 -0600 Subject: [Numpy-discussion] GSoC project: draft of proposal In-Reply-To: References: Message-ID: On Wed, Mar 12, 2014 at 11:12 AM, Leo Mao wrote: > Hi Aron, > > Previously mentioned by Julian, Yeppp may be a good candidate. > As for selecting a good library, I will consider the performance and the > API of the library. > The integration of the library should improve the performance of numpy and > also not make the source too complicated to maintain. > And I think the library should be mature so that the API will not be > changed significantly. > > Please point out if there is something I miss. > Also I will be grateful to any suggestions for my proposal. > > Regards, > Leo Mao > > > On Thu, Mar 13, 2014 at 12:54 AM, Aron Ahmadia wrote: > >> Hi Leo, >> >> Out of curiosity, which vector math libraries did you have in mind as >> likely candidates for inclusion? How are you planning on selecting the >> library to integrate? >> >> Cheers, >> Aron >> >> >> On Wed, Mar 12, 2014 at 12:52 PM, Leo Mao wrote: >> >>> Hi, >>> The attachment is my draft of proposal. The project is "vector math >>> library integration". >>> I think I need some feedback to make it solider. >>> Any comment will be appreciated. >>> Thanks in advance. >>> >>> Regards, >>> Leo Mao >>> >>> The proposal as it stands is too open ended and lacking in specifics. Probably you should select a library before the start of GSOC, or at least have a list of candidates, and also narrow the part of numpy you want to improve to something definite: linalg, special functions, etc. That doesn't mean you can't do more if time allows ;) An estimate of expected gains over current code would also help. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Wed Mar 12 18:04:36 2014 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 12 Mar 2014 22:04:36 +0000 Subject: [Numpy-discussion] Matrix multiplication infix operator PEP nearly ready to go Message-ID: Hi all, The proposal to add an infix operator to Python for matrix multiplication is nearly ready for its debut on python-ideas; so if you want to look it over first, just want to check out where it's gone, then now's a good time: https://github.com/numpy/numpy/pull/4351 The basic idea here is to try to make the strongest argument we can for the simplest extension that we actually want, and then whether it gets accepted or rejected at least we'll know that's final. Absolutely all comments and feedback welcome. Cheers, -n -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org From alan.isaac at gmail.com Wed Mar 12 21:03:46 2014 From: alan.isaac at gmail.com (Alan G Isaac) Date: Wed, 12 Mar 2014 21:03:46 -0400 Subject: [Numpy-discussion] Matrix multiplication infix operator PEP nearly ready to go In-Reply-To: References: Message-ID: <532103F2.4050609@gmail.com> On 3/12/2014 6:04 PM, Nathaniel Smith wrote: > https://github.com/numpy/numpy/pull/4351 The Semantics section still begins with 0d, then 2d, then 1d, then nd. Given the context of the proposal, the order should be: 2d (the core need expressed in the proposal) nd (which generalizes via broadcasting the 2d behavior) 1d (special casing) 0d (error) In this context I see one serious problem: is there a NumPy function that produces the proposed nd behavior? If not why not, and can it really be sold as a core need if the need to implement it has never been pressed to the point of an implementation? Unless this behavior is first implemented, the obvious question remains: why will `@` not just implement `dot`, for which there is a well tested and much used implementation? Note I am not taking a position on the semantics. I'm just pointing out a question that is sure to arise. Cheers, Alan From njs at pobox.com Wed Mar 12 23:40:49 2014 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 13 Mar 2014 03:40:49 +0000 Subject: [Numpy-discussion] Matrix multiplication infix operator PEP nearly ready to go In-Reply-To: <532103F2.4050609@gmail.com> References: <532103F2.4050609@gmail.com> Message-ID: On Thu, Mar 13, 2014 at 1:03 AM, Alan G Isaac wrote: > On 3/12/2014 6:04 PM, Nathaniel Smith wrote: >> https://github.com/numpy/numpy/pull/4351 > > The Semantics section still begins with 0d, then 2d, then 1d, then nd. > Given the context of the proposal, the order should be: > > 2d (the core need expressed in the proposal) > nd (which generalizes via broadcasting the 2d behavior) > 1d (special casing) > 0d (error) I've just switched it to 2d -> 1d -> 3d+ -> 0d. You're right that 2d should go first, but IMO 1d should go after it because 2d and 1d are the two cases that really get used heavily in practice. > In this context I see one serious problem: is there a NumPy function > that produces the proposed nd behavior? If not why not, and > can it really be sold as a core need if the need to implement > it has never been pressed to the point of an implementation? The logic isn't "we have a core need to implement these exact semantics". It's: "we have a core need for this operator; given that we are adding an operator we have to figure out exactly what the semantics should be; we did that and documented it and got consensus from a bunch of projects on it". I don't think the actual details of the semantics matter nearly as much as the fact that they exist. > Unless this behavior is first implemented, the obvious question remains: > why will `@` not just implement `dot`, for which there is a well > tested and much used implementation? Because of the reason above, I'm not sure it will come up (I don't think python-dev is nearly as familiar with the corner cases of numpy.dot as we are :-)). But if it does the answer is easy: no-one ever thought through exactly how `dot` should work in these rare edge cases, now we did. But we can't just change `dot` quickly, because of backwards compatibility considerations. `@` is new, there's no compatibility problem, so we might as well get it right from the start. If the behavioural differences between `dot` and `@` were more controversial then I'd worry more. But the consequences of the 0d thing are trivial to understand, and in the 3d+ case we're already shipping dozens of functions that have exactly these broadcasting semantics. -n From ralf.gommers at gmail.com Thu Mar 13 03:27:32 2014 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Thu, 13 Mar 2014 08:27:32 +0100 Subject: [Numpy-discussion] GSoC application template available In-Reply-To: References: Message-ID: On Mon, Mar 10, 2014 at 10:45 PM, Ralf Gommers wrote: > Hi GSoC students, > > The PSF just made their application template for this year available: > https://wiki.python.org/moin/SummerOfCode/ApplicationTemplate2014. There > are a few things in there that are required (for one, submit a patch to > numpy or scipy if you haven't done so yet), and some good recommendations. > Also a heads up that Google has changes the process a bit; you'll be required to provide proof that you're a student now instead of later on. Apparently this slows down the initial part of the application, so it's even more important that you submit your (draft) proposals early on. Note that you can keep on editing in Melange until the deadline. Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From sudheer.joseph at yahoo.com Thu Mar 13 11:52:55 2014 From: sudheer.joseph at yahoo.com (Sudheer Joseph) Date: Thu, 13 Mar 2014 23:52:55 +0800 (SGT) Subject: [Numpy-discussion] python array Message-ID: <1394725975.8249.YahooMailNeo@web193402.mail.sg3.yahoo.com> Dear experts, ???????????????????? I am encountering a strange behaviour of python data array as below. I have been trying to use the data from a netcdf file(attached herewith) to do certain calculation using below code. If I take absolute value of the same array and look for values <.5? I get a different value than the original array. But the fact is that this particular case do not have any negative values in the array( but there are other files where it can have negative values so the condition is put). I do not see any reason for getting different numbers for values <.5 in case of bt and expected it to be same as that of r2010. If any one has a guess on what is behind this behaviour please help. In [14]: from netCDF4 import Dataset as nc In [15]: nf=nc('r2010.nc') In [16]: r2010=nf.variables['R2010'][:] In [17]: bt=abs(r2010) In [18]: bt[bt<=.5].shape Out[18]: (2872,) In [19]: r2010[r2010<.5].shape Out[19]: (36738,) bt.min() Out[20]: 0.0027588337040836768 In [21]: bt.max() Out[21]: 3.5078965479057089 In [22]: r2010.max() Out[22]: 3.5078965479057089 In [23]: r2010.min() Out[23]: 0.0027588337040836768 ? *************************************************************** Sudheer Joseph Indian National Centre for Ocean Information Services Ministry of Earth Sciences, Govt. of India POST BOX NO: 21, IDA Jeedeemetla P.O. Via Pragathi Nagar,Kukatpally, Hyderabad; Pin:5000 55 Tel:+91-40-23886047(O),Fax:+91-40-23895011(O), Tel:+91-40-23044600(R),Tel:+91-40-9440832534(Mobile) E-mail:sjo.India at gmail.com;sudheer.joseph at yahoo.com Web- http://oppamthadathil.tripod.com *************************************************************** -------------- next part -------------- A non-text attachment was scrubbed... Name: r2010.nc Type: application/x-netcdf Size: 701512 bytes Desc: not available URL: From Nicolas.Rougier at inria.fr Thu Mar 13 12:39:19 2014 From: Nicolas.Rougier at inria.fr (Nicolas Rougier) Date: Thu, 13 Mar 2014 17:39:19 +0100 Subject: [Numpy-discussion] python array In-Reply-To: <1394725975.8249.YahooMailNeo@web193402.mail.sg3.yahoo.com> References: <1394725975.8249.YahooMailNeo@web193402.mail.sg3.yahoo.com> Message-ID: Seems to be related to the masked values: print r2010[:3,:3] [[-- -- --] [-- -- --] [-- -- --]] print abs(r2010)[:3,:3] [[-- -- --] [-- -- --] [-- -- --]] print r2010[ r2010[:3,:3] <0 ] [-- -- -- -- -- -- -- -- --] print r2010[ abs(r2010)[:3,:3] < 0] [] Nicolas On 13 Mar 2014, at 16:52, Sudheer Joseph wrote: > Dear experts, > I am encountering a strange behaviour of python data array as below. I have been trying to use the data from a netcdf file(attached herewith) to do certain calculation using below code. If I take absolute value of the same array and look for values <.5 I get a different value than the original array. But the fact is that this particular case do not have any negative values in the array( but there are other files where it can have negative values so the condition is put). I do not see any reason for getting different numbers for values <.5 in case of bt and expected it to be same as that of r2010. If any one has a guess on what is behind this behaviour please help. > > > In [14]: from netCDF4 import Dataset as nc > > In [15]: nf=nc('r2010.nc') > In [16]: r2010=nf.variables['R2010'][:] > In [17]: bt=abs(r2010) > In [18]: bt[bt<=.5].shape > Out[18]: (2872,) > In [19]: r2010[r2010<.5].shape > Out[19]: (36738,) > > > bt.min() > Out[20]: 0.0027588337040836768 > > In [21]: bt.max() > Out[21]: 3.5078965479057089 > In [22]: r2010.max() > Out[22]: 3.5078965479057089 > In [23]: r2010.min() > Out[23]: 0.0027588337040836768 > > > > *************************************************************** > Sudheer Joseph > Indian National Centre for Ocean Information Services > Ministry of Earth Sciences, Govt. of India > POST BOX NO: 21, IDA Jeedeemetla P.O. > Via Pragathi Nagar,Kukatpally, Hyderabad; Pin:5000 55 > Tel:+91-40-23886047(O),Fax:+91-40-23895011(O), > Tel:+91-40-23044600(R),Tel:+91-40-9440832534(Mobile) > E-mail:sjo.India at gmail.com;sudheer.joseph at yahoo.com > Web- http://oppamthadathil.tripod.com > ***************************************************************_______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From charlesr.harris at gmail.com Thu Mar 13 13:02:57 2014 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 13 Mar 2014 11:02:57 -0600 Subject: [Numpy-discussion] Removal of doc/numpybook Message-ID: Hi All, In a note on a PR, Ralf has suggested the removal of doc/numpybook. I believe most of the content is now part of the numpy documentation, and the book itself is outdated in some parts. Thoughts? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Thu Mar 13 13:08:43 2014 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 13 Mar 2014 17:08:43 +0000 Subject: [Numpy-discussion] Removal of doc/numpybook In-Reply-To: References: Message-ID: On Thu, Mar 13, 2014 at 5:02 PM, Charles R Harris wrote: > Hi All, > > In a note on a PR, Ralf has suggested the removal of doc/numpybook. I > believe most of the content is now part of the numpy documentation, and the > book itself is outdated in some parts. Sounds reasonable. It would be good to tag the repo before removal so we can refer back to it in case there is anything left to port over. -- Robert Kern From charlesr.harris at gmail.com Thu Mar 13 13:18:01 2014 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 13 Mar 2014 11:18:01 -0600 Subject: [Numpy-discussion] Removal of doc/numpybook In-Reply-To: References: Message-ID: On Thu, Mar 13, 2014 at 11:08 AM, Robert Kern wrote: > On Thu, Mar 13, 2014 at 5:02 PM, Charles R Harris > wrote: > > Hi All, > > > > In a note on a PR, Ralf has suggested the removal of doc/numpybook. I > > believe most of the content is now part of the numpy documentation, and > the > > book itself is outdated in some parts. > > Sounds reasonable. It would be good to tag the repo before removal so > we can refer back to it in case there is anything left to port over. > > Can do. Maybe 'numpybook'? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Thu Mar 13 13:24:36 2014 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 13 Mar 2014 17:24:36 +0000 Subject: [Numpy-discussion] Removal of doc/numpybook In-Reply-To: References: Message-ID: On Thu, Mar 13, 2014 at 5:18 PM, Charles R Harris wrote: > > On Thu, Mar 13, 2014 at 11:08 AM, Robert Kern wrote: >> >> On Thu, Mar 13, 2014 at 5:02 PM, Charles R Harris >> wrote: >> > Hi All, >> > >> > In a note on a PR, Ralf has suggested the removal of doc/numpybook. I >> > believe most of the content is now part of the numpy documentation, and >> > the >> > book itself is outdated in some parts. >> >> Sounds reasonable. It would be good to tag the repo before removal so >> we can refer back to it in case there is anything left to port over. > > Can do. Maybe 'numpybook'? I prefer a slightly more verbose tag for context: 'pre-removal-numpybook'. -- Robert Kern From lmao20001 at gmail.com Thu Mar 13 13:35:09 2014 From: lmao20001 at gmail.com (Leo Mao) Date: Fri, 14 Mar 2014 01:35:09 +0800 Subject: [Numpy-discussion] GSoC project: draft of proposal In-Reply-To: References: Message-ID: Hi, Thanks a lot for your advice, Chuck. Following your advice, I have modified my draft of proposal. (attachment) I think it still needs more comments so that I can make it better. And I found that maybe I can also make some functions related to linalg (like dot, svd or something else) faster by integrating a proper library into numpy. Regards, Leo Mao On Thu, Mar 13, 2014 at 3:09 AM, Charles R Harris wrote: > > > > On Wed, Mar 12, 2014 at 11:12 AM, Leo Mao wrote: > >> Hi Aron, >> >> Previously mentioned by Julian, Yeppp may be a good candidate. >> As for selecting a good library, I will consider the performance and the >> API of the library. >> The integration of the library should improve the performance of numpy >> and also not make the source too complicated to maintain. >> And I think the library should be mature so that the API will not be >> changed significantly. >> >> Please point out if there is something I miss. >> Also I will be grateful to any suggestions for my proposal. >> >> Regards, >> Leo Mao >> >> >> On Thu, Mar 13, 2014 at 12:54 AM, Aron Ahmadia wrote: >> >>> Hi Leo, >>> >>> Out of curiosity, which vector math libraries did you have in mind as >>> likely candidates for inclusion? How are you planning on selecting the >>> library to integrate? >>> >>> Cheers, >>> Aron >>> >>> >>> On Wed, Mar 12, 2014 at 12:52 PM, Leo Mao wrote: >>> >>>> Hi, >>>> The attachment is my draft of proposal. The project is "vector math >>>> library integration". >>>> I think I need some feedback to make it solider. >>>> Any comment will be appreciated. >>>> Thanks in advance. >>>> >>>> Regards, >>>> Leo Mao >>>> >>>> > The proposal as it stands is too open ended and lacking in specifics. > Probably you should select a library before the start of GSOC, or at least > have a list of candidates, and also narrow the part of numpy you want to > improve to something definite: linalg, special functions, etc. That doesn't > mean you can't do more if time allows ;) An estimate of expected gains over > current code would also help. > > Chuck > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- An HTML attachment was scrubbed... URL: From argriffi at ncsu.edu Thu Mar 13 13:43:16 2014 From: argriffi at ncsu.edu (alex) Date: Thu, 13 Mar 2014 13:43:16 -0400 Subject: [Numpy-discussion] GSoC project: draft of proposal In-Reply-To: References: Message-ID: On Thu, Mar 13, 2014 at 1:35 PM, Leo Mao wrote: >> > And I found that maybe I can also make some functions related to linalg > (like dot, svd or something else) faster by integrating a proper library > into numpy. I think everyone who wants fast numpy linalg already connects to something like OpenBLAS or MKL. When these are not available, numpy uses its own "lapack-lite" which is way slower. I don't think you are going to beat OpenBLAS, so are you suggesting to speed up the slow default "lapack-lite", or are you proposing something else? From chris.barker at noaa.gov Thu Mar 13 19:53:13 2014 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Thu, 13 Mar 2014 16:53:13 -0700 Subject: [Numpy-discussion] python array In-Reply-To: References: <1394725975.8249.YahooMailNeo@web193402.mail.sg3.yahoo.com> Message-ID: <4906500139288932242@unknownmsgid> On Mar 13, 2014, at 9:39 AM, Nicolas Rougier wrote: > > Seems to be related to the masked values: Good hint -- a masked array keeps the "junk" values in the main array. What "abs" are you using -- it may not be mask-aware. ( you want a numpy abs anyway) Also -- I'm not sure I know what happens with Boolean operators on masked arrays when you use them to index. I'd investigate that. (sorry, not at a machine I can play with now) Chris > print r2010[:3,:3] > [[-- -- --] > [-- -- --] > [-- -- --]] > > print abs(r2010)[:3,:3] > [[-- -- --] > [-- -- --] > [-- -- --]] > > > print r2010[ r2010[:3,:3] <0 ] > [-- -- -- -- -- -- -- -- --] > > print r2010[ abs(r2010)[:3,:3] < 0] > [] > > Nicolas > > > > On 13 Mar 2014, at 16:52, Sudheer Joseph wrote: > >> Dear experts, >> I am encountering a strange behaviour of python data array as below. I have been trying to use the data from a netcdf file(attached herewith) to do certain calculation using below code. If I take absolute value of the same array and look for values <.5 I get a different value than the original array. But the fact is that this particular case do not have any negative values in the array( but there are other files where it can have negative values so the condition is put). I do not see any reason for getting different numbers for values <.5 in case of bt and expected it to be same as that of r2010. If any one has a guess on what is behind this behaviour please help. >> >> >> In [14]: from netCDF4 import Dataset as nc >> >> In [15]: nf=nc('r2010.nc') >> In [16]: r2010=nf.variables['R2010'][:] >> In [17]: bt=abs(r2010) >> In [18]: bt[bt<=.5].shape >> Out[18]: (2872,) >> In [19]: r2010[r2010<.5].shape >> Out[19]: (36738,) >> >> >> bt.min() >> Out[20]: 0.0027588337040836768 >> >> In [21]: bt.max() >> Out[21]: 3.5078965479057089 >> In [22]: r2010.max() >> Out[22]: 3.5078965479057089 >> In [23]: r2010.min() >> Out[23]: 0.0027588337040836768 >> >> >> >> *************************************************************** >> Sudheer Joseph >> Indian National Centre for Ocean Information Services >> Ministry of Earth Sciences, Govt. of India >> POST BOX NO: 21, IDA Jeedeemetla P.O. >> Via Pragathi Nagar,Kukatpally, Hyderabad; Pin:5000 55 >> Tel:+91-40-23886047(O),Fax:+91-40-23895011(O), >> Tel:+91-40-23044600(R),Tel:+91-40-9440832534(Mobile) >> E-mail:sjo.India at gmail.com;sudheer.joseph at yahoo.com >> Web- http://oppamthadathil.tripod.com >> ***************************************************************_______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From sudheer.joseph at yahoo.com Thu Mar 13 21:09:34 2014 From: sudheer.joseph at yahoo.com (Sudheer Joseph) Date: Fri, 14 Mar 2014 09:09:34 +0800 (SGT) Subject: [Numpy-discussion] python array In-Reply-To: <4906500139288932242@unknownmsgid> Message-ID: <1394759374.84418.YahooMailBasic@web193405.mail.sg3.yahoo.com> Thank you very much Nicolas and Chris, The hint was helpful and from that I treid below steps ( a crude way I would say) and getting same result now I have been using abs available by default and it is the same with numpy.absolute( i checked). nr= ((r2010>r2010.min()) & (r2010 wrote: Subject: Re: [Numpy-discussion] python array To: "Discussion of Numerical Python" Date: Thursday, 13 March, 2014, 11:53 PM On Mar 13, 2014, at 9:39 AM, Nicolas Rougier wrote: > > Seems to be related to the masked values: Good hint -- a masked array keeps the "junk" values in the main array. What "abs" are you using -- it may not be mask-aware. ( you want a numpy abs anyway) Also -- I'm not sure I know what happens with Boolean operators on masked arrays when you use them to index. I'd investigate that. (sorry, not at a machine I can play with now) Chris > print r2010[:3,:3] > [[-- -- --] > [-- -- --] > [-- -- --]] > > print abs(r2010)[:3,:3] > [[-- -- --] > [-- -- --] > [-- -- --]] > > > print r2010[ r2010[:3,:3] <0 ] > [-- -- -- -- -- -- -- -- --] > > print r2010[ abs(r2010)[:3,:3] < 0] > [] > > Nicolas > > > > On 13 Mar 2014, at 16:52, Sudheer Joseph wrote: > >> Dear experts, >>? ? ? ? ? ? ? ? ? ???I am encountering a strange behaviour of python data array as below. I have been trying to use the data from a netcdf file(attached herewith) to do certain calculation using below code. If I take absolute value of the same array and look for values <.5? I get a different value than the original array. But the fact is that this particular case do not have any negative values in the array( but there are other files where it can have negative values so the condition is put). I do not see any reason for getting different numbers for values <.5 in case of bt and expected it to be same as that of r2010. If any one has a guess on what is behind this behaviour please help. >> >> >> In [14]: from netCDF4 import Dataset as nc >> >> In [15]: nf=nc('r2010.nc') >> In [16]: r2010=nf.variables['R2010'][:] >> In [17]: bt=abs(r2010) >> In [18]: bt[bt<=.5].shape >> Out[18]: (2872,) >> In [19]: r2010[r2010<.5].shape >> Out[19]: (36738,) >> >> >> bt.min() >> Out[20]: 0.0027588337040836768 >> >> In [21]: bt.max() >> Out[21]: 3.5078965479057089 >> In [22]: r2010.max() >> Out[22]: 3.5078965479057089 >> In [23]: r2010.min() >> Out[23]: 0.0027588337040836768 >> >> >> >> *************************************************************** >> Sudheer Joseph >> Indian National Centre for Ocean Information Services >> Ministry of Earth Sciences, Govt. of India >> POST BOX NO: 21, IDA Jeedeemetla P.O. >> Via Pragathi Nagar,Kukatpally, Hyderabad; Pin:5000 55 >> Tel:+91-40-23886047(O),Fax:+91-40-23895011(O), >> Tel:+91-40-23044600(R),Tel:+91-40-9440832534(Mobile) >> E-mail:sjo.India at gmail.com;sudheer.joseph at yahoo.com >> Web- http://oppamthadathil.tripod.com >> ***************************************************************_______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion From sudheer.joseph at yahoo.com Thu Mar 13 21:14:18 2014 From: sudheer.joseph at yahoo.com (Sudheer Joseph) Date: Fri, 14 Mar 2014 09:14:18 +0800 (SGT) Subject: [Numpy-discussion] python array In-Reply-To: <1394759374.84418.YahooMailBasic@web193405.mail.sg3.yahoo.com> Message-ID: <1394759658.80282.YahooMailBasic@web193405.mail.sg3.yahoo.com> Sorry, The below solution I thoght working was not working but was just giving array size. -------------------------------------------- On Fri, 14/3/14, Sudheer Joseph wrote: Subject: Re: [Numpy-discussion] python array To: "Discussion of Numerical Python" Date: Friday, 14 March, 2014, 1:09 AM Thank you very much Nicolas and Chris, ? ? ? ? ? ? ? ? ? ? ? ? ? ???The hint was helpful and from that I treid below steps ( a crude way I would say) and getting same result now I have been using abs available by default and it is the same with numpy.absolute( i checked). nr= ((r2010>r2010.min()) & (r2010 wrote: Subject: Re: [Numpy-discussion] python array To: "Discussion of Numerical Python" Date: Thursday, 13 March, 2014, 11:53 PM On Mar 13, 2014, at 9:39 AM, Nicolas Rougier wrote: > > Seems to be related to the masked values: Good hint -- a masked array keeps the "junk" values in the main array. What "abs" are you using -- it may not be mask-aware. ( you want a numpy abs anyway) Also -- I'm not sure I know what happens with Boolean operators on masked arrays when you use them to index. I'd investigate that. (sorry, not at a machine I can play with now) Chris > print r2010[:3,:3] > [[-- -- --] > [-- -- --] > [-- -- --]] > > print abs(r2010)[:3,:3] > [[-- -- --] > [-- -- --] > [-- -- --]] > > > print r2010[ r2010[:3,:3] <0 ] > [-- -- -- -- -- -- -- -- --] > > print r2010[ abs(r2010)[:3,:3] < 0] > [] > > Nicolas > > > > On 13 Mar 2014, at 16:52, Sudheer Joseph wrote: > >> Dear experts, >>? ? ? ? ? ? ? ? ? ???I am encountering a strange behaviour of python data array as below. I have been trying to use the data from a netcdf file(attached herewith) to do certain calculation using below code. If I take absolute value of the same array and look for values <.5? I get a different value than the original array. But the fact is that this particular case do not have any negative values in the array( but there are other files where it can have negative values so the condition is put). I do not see any reason for getting different numbers for values <.5 in case of bt and expected it to be same as that of r2010. If any one has a guess on what is behind this behaviour please help. >> >> >> In [14]: from netCDF4 import Dataset as nc >> >> In [15]: nf=nc('r2010.nc') >> In [16]: r2010=nf.variables['R2010'][:] >> In [17]: bt=abs(r2010) >> In [18]: bt[bt<=.5].shape >> Out[18]: (2872,) >> In [19]: r2010[r2010<.5].shape >> Out[19]: (36738,) >> >> >> bt.min() >> Out[20]: 0.0027588337040836768 >> >> In [21]: bt.max() >> Out[21]: 3.5078965479057089 >> In [22]: r2010.max() >> Out[22]: 3.5078965479057089 >> In [23]: r2010.min() >> Out[23]: 0.0027588337040836768 >> >> >> >> *************************************************************** >> Sudheer Joseph >> Indian National Centre for Ocean Information Services >> Ministry of Earth Sciences, Govt. of India >> POST BOX NO: 21, IDA Jeedeemetla P.O. >> Via Pragathi Nagar,Kukatpally, Hyderabad; Pin:5000 55 >> Tel:+91-40-23886047(O),Fax:+91-40-23895011(O), >> Tel:+91-40-23044600(R),Tel:+91-40-9440832534(Mobile) >> E-mail:sjo.India at gmail.com;sudheer.joseph at yahoo.com >> Web- http://oppamthadathil.tripod.com >> ***************************************************************_______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion From brett.olsen at gmail.com Thu Mar 13 22:07:07 2014 From: brett.olsen at gmail.com (Brett Olsen) Date: Thu, 13 Mar 2014 21:07:07 -0500 Subject: [Numpy-discussion] python array In-Reply-To: <1394759658.80282.YahooMailBasic@web193405.mail.sg3.yahoo.com> References: <1394759374.84418.YahooMailBasic@web193405.mail.sg3.yahoo.com> <1394759658.80282.YahooMailBasic@web193405.mail.sg3.yahoo.com> Message-ID: The difference appears to be that the boolean selection pulls out all data values <= 0.5 whether or not they are masked, and then carries over the appropriate masks to the new array. So r2010 and bt contain identical unmasked values but different numbers of masked values. Because the initial fill value for your masked values was a large negative number, in r2010 those masked values are carried over. In bt, you've taken the absolute value of the data array, so those fill values are now positive and they are no longer carried over into the indexed array. Because the final arrays are still masked, you are observing no difference in the statistical properties of the arrays, only their sizes, because one contains many more masked values than the other. I don't think this should be a problem for your computations. If you're concerned, you could always explicitly demask them before your computations. See the example problem below. ~Brett In [61]: import numpy as np In [62]: import numpy.ma as ma In [65]: a = np.arange(-8, 8).reshape((4, 4)) In [66]: a Out[66]: array([[-8, -7, -6, -5], [-4, -3, -2, -1], [ 0, 1, 2, 3], [ 4, 5, 6, 7]]) In [68]: b = ma.masked_array(a, mask=a < 0) In [69]: b Out[69]: masked_array(data = [[-- -- -- --] [-- -- -- --] [0 1 2 3] [4 5 6 7]], mask = [[ True True True True] [ True True True True] [False False False False] [False False False False]], fill_value = 999999) In [70]: b.data Out[70]: array([[-8, -7, -6, -5], [-4, -3, -2, -1], [ 0, 1, 2, 3], [ 4, 5, 6, 7]]) In [71]: c = abs(b) In [72]: c[c <= 4].shape Out[72]: (9L,) In [73]: b[b <= 4].shape Out[73]: (13L,) In [74]: b[b <= 4] Out[74]: masked_array(data = [-- -- -- -- -- -- -- -- 0 1 2 3 4], mask = [ True True True True True True True True False False False False False], fill_value = 999999) In [75]: c[c <= 4] Out[75]: masked_array(data = [-- -- -- -- 0 1 2 3 4], mask = [ True True True True False False False False False], fill_value = 999999) On Thu, Mar 13, 2014 at 8:14 PM, Sudheer Joseph wrote: > Sorry, > The below solution I thoght working was not working but was > just giving array size. > > -------------------------------------------- > On Fri, 14/3/14, Sudheer Joseph wrote: > > Subject: Re: [Numpy-discussion] python array > To: "Discussion of Numerical Python" > Date: Friday, 14 March, 2014, 1:09 AM > > Thank you very much Nicolas and > Chris, > > The > hint was helpful and from that I treid below steps ( a crude > way I would say) and getting same result now > > I have been using abs available by default and it is the > same with numpy.absolute( i checked). > > nr= ((r2010>r2010.min()) & (r2010 nr[nr<.5].shape > Out[25]: (33868,) > anr=numpy.absolute(nr) > anr[anr<.5].shape > Out[27]: (33868,) > > This way I used may have problem when mask used has values > which can affect the min max operation. > > So I would like to know if there is a standard formal ( > python/numpy) way to handle masked array when they need to > be subjected to boolean operations. > > with best regards, > Sudheer > > > *************************************************************** > Sudheer Joseph > Indian National Centre for Ocean Information Services > Ministry of Earth Sciences, Govt. of India > POST BOX NO: 21, IDA Jeedeemetla P.O. > Via Pragathi Nagar,Kukatpally, Hyderabad; Pin:5000 55 > Tel:+91-40-23886047(O),Fax:+91-40-23895011(O), > Tel:+91-40-23044600(R),Tel:+91-40-9440832534(Mobile) > E-mail:sjo.India at gmail.com;sudheer.joseph at yahoo.com > Web- http://oppamthadathil.tripod.com > *************************************************************** > > -------------------------------------------- > On Thu, 13/3/14, Chris Barker - NOAA Federal > wrote: > > Subject: Re: [Numpy-discussion] python array > To: "Discussion of Numerical Python" > Date: Thursday, 13 March, 2014, 11:53 PM > > On Mar 13, 2014, at 9:39 AM, Nicolas > Rougier > wrote: > > > > > Seems to be related to the masked values: > > Good hint -- a masked array keeps the "junk" values in the > main array. > > What "abs" are you using -- it may not be mask-aware. ( > you > want a > numpy abs anyway) > > Also -- I'm not sure I know what happens with Boolean > operators on > masked arrays when you use them to index. I'd investigate > that. > (sorry, not at a machine I can play with now) > > Chris > > > > print r2010[:3,:3] > > [[-- -- --] > > [-- -- --] > > [-- -- --]] > > > > print abs(r2010)[:3,:3] > > [[-- -- --] > > [-- -- --] > > [-- -- --]] > > > > > > print r2010[ r2010[:3,:3] <0 ] > > [-- -- -- -- -- -- -- -- --] > > > > print r2010[ abs(r2010)[:3,:3] < 0] > > [] > > > > Nicolas > > > > > > > > On 13 Mar 2014, at 16:52, Sudheer Joseph > wrote: > > > >> Dear experts, > >> > I am encountering a strange > behaviour of python data array as below. I have been > trying > to use the data from a netcdf file(attached herewith) to > do > certain calculation using below code. If I take absolute > value of the same array and look for values <.5 I > get a different value than the original array. But the > fact > is that this particular case do not have any negative > values > in the array( but there are other files where it can have > negative values so the condition is put). I do not see any > reason for getting different numbers for values <.5 in > case of bt and expected it to be same as that of r2010. If > any one has a guess on what is behind this behaviour > please > help. > >> > >> > >> In [14]: from netCDF4 import Dataset as nc > >> > >> In [15]: nf=nc('r2010.nc') > >> In [16]: r2010=nf.variables['R2010'][:] > >> In [17]: bt=abs(r2010) > >> In [18]: bt[bt<=.5].shape > >> Out[18]: (2872,) > >> In [19]: r2010[r2010<.5].shape > >> Out[19]: (36738,) > >> > >> > >> bt.min() > >> Out[20]: 0.0027588337040836768 > >> > >> In [21]: bt.max() > >> Out[21]: 3.5078965479057089 > >> In [22]: r2010.max() > >> Out[22]: 3.5078965479057089 > >> In [23]: r2010.min() > >> Out[23]: 0.0027588337040836768 > >> > >> > >> > >> > *************************************************************** > >> Sudheer Joseph > >> Indian National Centre for Ocean Information > Services > >> Ministry of Earth Sciences, Govt. of India > >> POST BOX NO: 21, IDA Jeedeemetla P.O. > >> Via Pragathi Nagar,Kukatpally, Hyderabad; > Pin:5000 > 55 > >> Tel:+91-40-23886047(O),Fax:+91-40-23895011(O), > >> > Tel:+91-40-23044600(R),Tel:+91-40-9440832534(Mobile) > >> E-mail:sjo.India at gmail.com;sudheer.joseph at yahoo.com > >> Web- http://oppamthadathil.tripod.com > >> > *************************************************************** >_______________________________________________ > >> NumPy-Discussion mailing list > >> NumPy-Discussion at scipy.org > >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From lmao20001 at gmail.com Fri Mar 14 02:34:20 2014 From: lmao20001 at gmail.com (Leo Mao) Date: Fri, 14 Mar 2014 14:34:20 +0800 Subject: [Numpy-discussion] GSoC project: draft of proposal In-Reply-To: References: Message-ID: On Fri, Mar 14, 2014 at 1:43 AM, alex wrote: > > I think everyone who wants fast numpy linalg already connects to > something like OpenBLAS or MKL. When these are not available, numpy > uses its own "lapack-lite" which is way slower. I don't think you are > going to beat OpenBLAS, so are you suggesting to speed up the slow > default "lapack-lite", or are you proposing something else? > I think most CPUs nowadays support instructions like SSE2, AVX etc, so maybe numpy can use OpenBLAS (or somethine else) by default ? -------------- next part -------------- An HTML attachment was scrubbed... URL: From sudheer.joseph at yahoo.com Fri Mar 14 02:48:33 2014 From: sudheer.joseph at yahoo.com (Sudheer Joseph) Date: Fri, 14 Mar 2014 14:48:33 +0800 (SGT) Subject: [Numpy-discussion] python array In-Reply-To: Message-ID: <1394779713.93522.YahooMailBasic@web193401.mail.sg3.yahoo.com> Thank you Olsen, My objective was to find out, how many values are falling under different ranges. ie, find RMS < ,5 and then rms between .5 and .8 etc. If there is a speficic python way of handling mask and making boolean operation with out any doubt, I was looking for that. The data I am using has a mask and if I wanted to tell python do not consider the masked values and masked grid points, ( a percentatge is calculated afterwards using the number of grid points) while doing the calculation. I will try in detail the example you send and see how python handles this. with best regards, Sudheer *************************************************************** Sudheer Joseph Indian National Centre for Ocean Information Services Ministry of Earth Sciences, Govt. of India POST BOX NO: 21, IDA Jeedeemetla P.O. Via Pragathi Nagar,Kukatpally, Hyderabad; Pin:5000 55 Tel:+91-40-23886047(O),Fax:+91-40-23895011(O), Tel:+91-40-23044600(R),Tel:+91-40-9440832534(Mobile) E-mail:sjo.India at gmail.com;sudheer.joseph at yahoo.com Web- http://oppamthadathil.tripod.com *************************************************************** -------------------------------------------- On Fri, 14/3/14, Brett Olsen wrote: Subject: Re: [Numpy-discussion] python array To: "Discussion of Numerical Python" Date: Friday, 14 March, 2014, 2:07 AM The difference appears to be that the boolean selection pulls out all data values <= 0.5 whether or not they are masked, and then carries over the appropriate masks to the new array. ?So r2010 and bt contain identical unmasked values but different numbers of masked values. ?Because the initial fill value for your masked values was a large negative number, in r2010 those masked values are carried over. ?In bt, you've taken the absolute value of the data array, so those fill values are now positive and they are no longer carried over into the indexed array. Because the final arrays are still masked, you are observing no difference in the statistical properties of the arrays, only their sizes, because one contains many more masked values than the other. ?I don't think this should be a problem for your computations. If you're concerned, you could always explicitly demask them before your computations. ?See the example problem below. ~Brett In [61]: import numpy as np In [62]: import numpy.ma as ma In [65]: a = np.arange(-8, 8).reshape((4, 4)) In [66]: aOut[66]:array([[-8, -7, -6, -5],? ? ? ?[-4, -3, -2, -1],? ? ? ?[ 0, ?1, ?2, ?3],? ? ? ?[ 4, ?5, ?6, ?7]]) In [68]: b = ma.masked_array(a, mask=a < 0) In [69]: b Out[69]:masked_array(data =?[[-- -- -- --]?[-- -- -- --]?[0 1 2 3]?[4 5 6 7]],? ? ? ? ? ? ?mask = ?[[ True ?True ?True ?True]?[ True ?True ?True ?True]?[False False False False]?[False False False False]],? ? ? ?fill_value = 999999) In [70]: b.data Out[70]:array([[-8, -7, -6, -5],? ? ? ?[-4, -3, -2, -1],? ? ? ?[ 0, ?1, ?2, ?3],? ? ? ?[ 4, ?5, ?6, ?7]]) In [71]: c = abs(b) In [72]: c[c <= 4].shapeOut[72]: (9L,) In [73]: b[b <= 4].shapeOut[73]: (13L,) In [74]: b[b <= 4]Out[74]:masked_array(data = [-- -- -- -- -- -- -- -- 0 1 2 3 4], ? ? ? ? ? ? ?mask = [ True ?True ?True ?True ?True ?True ?True ?True False False False False?False],? ? ? ?fill_value = 999999) In [75]: c[c <= 4] Out[75]:masked_array(data = [-- -- -- -- 0 1 2 3 4],? ? ? ? ? ? ?mask = [ True ?True ?True ?True False False False False False],? ? ? ?fill_value = 999999) On Thu, Mar 13, 2014 at 8:14 PM, Sudheer Joseph wrote: Sorry, ? ? ? ? ? ?The below solution I thoght working was not working but was just giving array size. -------------------------------------------- On Fri, 14/3/14, Sudheer Joseph wrote: ?Subject: Re: [Numpy-discussion] python array ?To: "Discussion of Numerical Python" ?Date: Friday, 14 March, 2014, 1:09 AM ?Thank you very much Nicolas and ?Chris, ?? ? ? ? ? ? ? ? ?? ? ? ? ? ???The ?hint was helpful and from that I treid below steps ( a crude ?way I would say) and getting same result now ?I have been using abs available by default and it is the ?same with numpy.absolute( i checked). ?nr= ((r2010>r2010.min()) & (r2010 ?wrote: ? Subject: Re: [Numpy-discussion] python array ? To: "Discussion of Numerical Python" ? Date: Thursday, 13 March, 2014, 11:53 PM ? On Mar 13, 2014, at 9:39 AM, Nicolas ? Rougier ? wrote: ? > ? > Seems to be related to the masked values: ? Good hint -- a masked array keeps the "junk" values in the ? main array. ? What "abs" are you using -- it may not be mask-aware. ( ?you ? want a ? numpy abs anyway) ? Also -- I'm not sure I know what happens with Boolean ? operators on ? masked arrays when you use them to index. I'd investigate ? that. ? (sorry, not at a machine I can play with now) ? Chris ? > print r2010[:3,:3] ? > [[-- -- --] ? > [-- -- --] ? > [-- -- --]] ? > ? > print abs(r2010)[:3,:3] ? > [[-- -- --] ? > [-- -- --] ? > [-- -- --]] ? > ? > ? > print r2010[ r2010[:3,:3] <0 ] ? > [-- -- -- -- -- -- -- -- --] ? > ? > print r2010[ abs(r2010)[:3,:3] < 0] ? > [] ? > ? > Nicolas ? > ? > ? > ? > On 13 Mar 2014, at 16:52, Sudheer Joseph ? wrote: ? > ? >> Dear experts, ? >>? ? ? ? ? ? ? ? ? ? ???I am encountering a strange ? behaviour of python data array as below. I have been ?trying ? to use the data from a netcdf file(attached herewith) to ?do ? certain calculation using below code. If I take absolute ? value of the same array and look for values <.5? I ? get a different value than the original array. But the ?fact ? is that this particular case do not have any negative ?values ? in the array( but there are other files where it can have ? negative values so the condition is put). I do not see any ? reason for getting different numbers for values <.5 in ? case of bt and expected it to be same as that of r2010. If ? any one has a guess on what is behind this behaviour ?please ? help. ? >> ? >> ? >> In [14]: from netCDF4 import Dataset as nc ? >> ? >> In [15]: nf=nc('r2010.nc') ? >> In [16]: r2010=nf.variables['R2010'][:] ? >> In [17]: bt=abs(r2010) ? >> In [18]: bt[bt<=.5].shape ? >> Out[18]: (2872,) ? >> In [19]: r2010[r2010<.5].shape ? >> Out[19]: (36738,) ? >> ? >> ? >> bt.min() ? >> Out[20]: 0.0027588337040836768 ? >> ? >> In [21]: bt.max() ? >> Out[21]: 3.5078965479057089 ? >> In [22]: r2010.max() ? >> Out[22]: 3.5078965479057089 ? >> In [23]: r2010.min() ? >> Out[23]: 0.0027588337040836768 ? >> ? >> ? >> ? >> ? *************************************************************** ? >> Sudheer Joseph ? >> Indian National Centre for Ocean Information ? Services ? >> Ministry of Earth Sciences, Govt. of India ? >> POST BOX NO: 21, IDA Jeedeemetla P.O. ? >> Via Pragathi Nagar,Kukatpally, Hyderabad; ?Pin:5000 ? 55 ? >> Tel:+91-40-23886047(O),Fax:+91-40-23895011(O), ? >> ? Tel:+91-40-23044600(R),Tel:+91-40-9440832534(Mobile) ? >> E-mail:sjo.India at gmail.com;sudheer.joseph at yahoo.com ? >> Web- http://oppamthadathil.tripod.com ? >> ? ***************************************************************_______________________________________________ ? >> NumPy-Discussion mailing list ? >> NumPy-Discussion at scipy.org ? >> http://mail.scipy.org/mailman/listinfo/numpy-discussion ? > ? > _______________________________________________ ? > NumPy-Discussion mailing list ? > NumPy-Discussion at scipy.org ? > http://mail.scipy.org/mailman/listinfo/numpy-discussion ? _______________________________________________ ? NumPy-Discussion mailing list ? NumPy-Discussion at scipy.org ? http://mail.scipy.org/mailman/listinfo/numpy-discussion ?_______________________________________________ ?NumPy-Discussion mailing list ?NumPy-Discussion at scipy.org ?http://mail.scipy.org/mailman/listinfo/numpy-discussion _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion -----Inline Attachment Follows----- _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion From sudheer.joseph at yahoo.com Fri Mar 14 03:09:33 2014 From: sudheer.joseph at yahoo.com (Sudheer Joseph) Date: Fri, 14 Mar 2014 15:09:33 +0800 (SGT) Subject: [Numpy-discussion] python array In-Reply-To: Message-ID: <1394780973.78002.YahooMailBasic@web193403.mail.sg3.yahoo.com> Dear Oslen, I had a detailed look at the example you send and points I got were below a = np.arange(-8, 8).reshape((4, 4)) b = ma.masked_array(a, mask=a < 0) Out[33]: b[b<4] masked_array(data = [-- -- -- -- -- -- -- -- 0 1 2 3], mask = [ True True True True True True True True False False False False], fill_value = 999999) In [34]: b[b<4].shape Out[34]: (12,) In [35]: b[b<4].data Out[35]: array([-8, -7, -6, -5, -4, -3, -2, -1, 0, 1, 2, 3]) This shows while numpy can do the bolean operation and list the data meeting the criteria( by masking the data further), it do not actually allow us get the count of data that meets the crieteria. I was interested in count. Because my objective was to find out how many numbers in the grid fall under different catagory.( <=4 , >4 & <=8 , >8<=10) etc. and find the percentage of them. Is there a way to get the counts correctly ? that is my botheration now !! with best regards, Sudheer On Fri, 14/3/14, Brett Olsen wrote: Subject: Re: [Numpy-discussion] python array To: "Discussion of Numerical Python" Date: Friday, 14 March, 2014, 2:07 AM The difference appears to be that the boolean selection pulls out all data values <= 0.5 whether or not they are masked, and then carries over the appropriate masks to the new array. ?So r2010 and bt contain identical unmasked values but different numbers of masked values. ?Because the initial fill value for your masked values was a large negative number, in r2010 those masked values are carried over. ?In bt, you've taken the absolute value of the data array, so those fill values are now positive and they are no longer carried over into the indexed array. Because the final arrays are still masked, you are observing no difference in the statistical properties of the arrays, only their sizes, because one contains many more masked values than the other. ?I don't think this should be a problem for your computations. If you're concerned, you could always explicitly demask them before your computations. ?See the example problem below. ~Brett In [61]: import numpy as np In [62]: import numpy.ma as ma In [65]: a = np.arange(-8, 8).reshape((4, 4)) In [66]: aOut[66]:array([[-8, -7, -6, -5],? ? ? ?[-4, -3, -2, -1],? ? ? ?[ 0, ?1, ?2, ?3],? ? ? ?[ 4, ?5, ?6, ?7]]) In [68]: b = ma.masked_array(a, mask=a < 0) In [69]: b Out[69]:masked_array(data =?[[-- -- -- --]?[-- -- -- --]?[0 1 2 3]?[4 5 6 7]],? ? ? ? ? ? ?mask = ?[[ True ?True ?True ?True]?[ True ?True ?True ?True]?[False False False False]?[False False False False]],? ? ? ?fill_value = 999999) In [70]: b.data Out[70]:array([[-8, -7, -6, -5],? ? ? ?[-4, -3, -2, -1],? ? ? ?[ 0, ?1, ?2, ?3],? ? ? ?[ 4, ?5, ?6, ?7]]) In [71]: c = abs(b) In [72]: c[c <= 4].shapeOut[72]: (9L,) In [73]: b[b <= 4].shapeOut[73]: (13L,) In [74]: b[b <= 4]Out[74]:masked_array(data = [-- -- -- -- -- -- -- -- 0 1 2 3 4], ? ? ? ? ? ? ?mask = [ True ?True ?True ?True ?True ?True ?True ?True False False False False?False],? ? ? ?fill_value = 999999) In [75]: c[c <= 4] Out[75]:masked_array(data = [-- -- -- -- 0 1 2 3 4],? ? ? ? ? ? ?mask = [ True ?True ?True ?True False False False False False],? ? ? ?fill_value = 999999) On Thu, Mar 13, 2014 at 8:14 PM, Sudheer Joseph wrote: Sorry, ? ? ? ? ? ?The below solution I thoght working was not working but was just giving array size. -------------------------------------------- On Fri, 14/3/14, Sudheer Joseph wrote: ?Subject: Re: [Numpy-discussion] python array ?To: "Discussion of Numerical Python" ?Date: Friday, 14 March, 2014, 1:09 AM ?Thank you very much Nicolas and ?Chris, ?? ? ? ? ? ? ? ? ?? ? ? ? ? ???The ?hint was helpful and from that I treid below steps ( a crude ?way I would say) and getting same result now ?I have been using abs available by default and it is the ?same with numpy.absolute( i checked). ?nr= ((r2010>r2010.min()) & (r2010 ?wrote: ? Subject: Re: [Numpy-discussion] python array ? To: "Discussion of Numerical Python" ? Date: Thursday, 13 March, 2014, 11:53 PM ? On Mar 13, 2014, at 9:39 AM, Nicolas ? Rougier ? wrote: ? > ? > Seems to be related to the masked values: ? Good hint -- a masked array keeps the "junk" values in the ? main array. ? What "abs" are you using -- it may not be mask-aware. ( ?you ? want a ? numpy abs anyway) ? Also -- I'm not sure I know what happens with Boolean ? operators on ? masked arrays when you use them to index. I'd investigate ? that. ? (sorry, not at a machine I can play with now) ? Chris ? > print r2010[:3,:3] ? > [[-- -- --] ? > [-- -- --] ? > [-- -- --]] ? > ? > print abs(r2010)[:3,:3] ? > [[-- -- --] ? > [-- -- --] ? > [-- -- --]] ? > ? > ? > print r2010[ r2010[:3,:3] <0 ] ? > [-- -- -- -- -- -- -- -- --] ? > ? > print r2010[ abs(r2010)[:3,:3] < 0] ? > [] ? > ? > Nicolas ? > ? > ? > ? > On 13 Mar 2014, at 16:52, Sudheer Joseph ? wrote: ? > ? >> Dear experts, ? >>? ? ? ? ? ? ? ? ? ? ???I am encountering a strange ? behaviour of python data array as below. I have been ?trying ? to use the data from a netcdf file(attached herewith) to ?do ? certain calculation using below code. If I take absolute ? value of the same array and look for values <.5? I ? get a different value than the original array. But the ?fact ? is that this particular case do not have any negative ?values ? in the array( but there are other files where it can have ? negative values so the condition is put). I do not see any ? reason for getting different numbers for values <.5 in ? case of bt and expected it to be same as that of r2010. If ? any one has a guess on what is behind this behaviour ?please ? help. ? >> ? >> ? >> In [14]: from netCDF4 import Dataset as nc ? >> ? >> In [15]: nf=nc('r2010.nc') ? >> In [16]: r2010=nf.variables['R2010'][:] ? >> In [17]: bt=abs(r2010) ? >> In [18]: bt[bt<=.5].shape ? >> Out[18]: (2872,) ? >> In [19]: r2010[r2010<.5].shape ? >> Out[19]: (36738,) ? >> ? >> ? >> bt.min() ? >> Out[20]: 0.0027588337040836768 ? >> ? >> In [21]: bt.max() ? >> Out[21]: 3.5078965479057089 ? >> In [22]: r2010.max() ? >> Out[22]: 3.5078965479057089 ? >> In [23]: r2010.min() ? >> Out[23]: 0.0027588337040836768 ? >> ? >> ? >> ? >> ? *************************************************************** ? >> Sudheer Joseph ? >> Indian National Centre for Ocean Information ? Services ? >> Ministry of Earth Sciences, Govt. of India ? >> POST BOX NO: 21, IDA Jeedeemetla P.O. ? >> Via Pragathi Nagar,Kukatpally, Hyderabad; ?Pin:5000 ? 55 ? >> Tel:+91-40-23886047(O),Fax:+91-40-23895011(O), ? >> ? Tel:+91-40-23044600(R),Tel:+91-40-9440832534(Mobile) ? >> E-mail:sjo.India at gmail.com;sudheer.joseph at yahoo.com ? >> Web- http://oppamthadathil.tripod.com ? >> ? ***************************************************************_______________________________________________ ? >> NumPy-Discussion mailing list ? >> NumPy-Discussion at scipy.org ? >> http://mail.scipy.org/mailman/listinfo/numpy-discussion ? > ? > _______________________________________________ ? > NumPy-Discussion mailing list ? > NumPy-Discussion at scipy.org ? > http://mail.scipy.org/mailman/listinfo/numpy-discussion ? _______________________________________________ ? NumPy-Discussion mailing list ? NumPy-Discussion at scipy.org ? http://mail.scipy.org/mailman/listinfo/numpy-discussion ?_______________________________________________ ?NumPy-Discussion mailing list ?NumPy-Discussion at scipy.org ?http://mail.scipy.org/mailman/listinfo/numpy-discussion _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion -----Inline Attachment Follows----- _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion From efiring at hawaii.edu Fri Mar 14 03:20:19 2014 From: efiring at hawaii.edu (Eric Firing) Date: Thu, 13 Mar 2014 21:20:19 -1000 Subject: [Numpy-discussion] python array In-Reply-To: <1394780973.78002.YahooMailBasic@web193403.mail.sg3.yahoo.com> References: <1394780973.78002.YahooMailBasic@web193403.mail.sg3.yahoo.com> Message-ID: <5322ADB3.3020209@hawaii.edu> On 2014/03/13 9:09 PM, Sudheer Joseph wrote: > Dear Oslen, > > I had a detailed look at the example you send and points I got were below > > a = np.arange(-8, 8).reshape((4, 4)) > b = ma.masked_array(a, mask=a < 0) > > > Out[33]: b[b<4] > masked_array(data = [-- -- -- -- -- -- -- -- 0 1 2 3], > mask = [ True True True True True True True True False False False False], > fill_value = 999999) > In [34]: b[b<4].shape > Out[34]: (12,) > In [35]: b[b<4].data > Out[35]: array([-8, -7, -6, -5, -4, -3, -2, -1, 0, 1, 2, 3]) > > This shows while numpy can do the bolean operation and list the data meeting the criteria( by masking the data further), it do not actually allow us get the count of data that meets the crieteria. I was interested in count. Because my objective was to find out how many numbers in the grid fall under different catagory.( <=4 , >4 & <=8 , >8<=10) etc. and find the percentage of them. > > Is there a way to get the counts correctly ? that is my botheration now !! Certainly. If all you need are statistics of the type you describe, where you are working with a 1-D array, then extract the unmasked values into an ordinary ndarray, and work with that: a = np.random.randn(100) am = np.ma.masked_less(a, -0.2) print am.count() # number of masked values a_nomask = am.compressed() print type(a_nomask) print a_nomask.shape # number of points with value less than 0.5: print (a_nomask < 0.5).sum() # (Boolean True is 1) # Or if you want the actual array of values, not just the count: a_nomask[a_nomask < 0.5] Eric > > with best regards, > Sudheer From sudheer.joseph at yahoo.com Fri Mar 14 03:57:08 2014 From: sudheer.joseph at yahoo.com (Sudheer Joseph) Date: Fri, 14 Mar 2014 15:57:08 +0800 (SGT) Subject: [Numpy-discussion] python array In-Reply-To: <5322ADB3.3020209@hawaii.edu> Message-ID: <1394783828.3149.YahooMailBasic@web193401.mail.sg3.yahoo.com> Thank you Eric, The compress is the option which is gets the correct numbers. a = np.arange(-8, 8).reshape((4, 4)) In [67]: b = ma.masked_array(a, mask=a < 0) In [68]: bb=b.compressed() In [69]: b[b<4].size Out[69]: 12 In [70]: bb=b.compressed() In [71]: bb[bb<=4].size Out[71]: 5 with best regards, Sudheer *************************************************************** Sudheer Joseph Indian National Centre for Ocean Information Services Ministry of Earth Sciences, Govt. of India POST BOX NO: 21, IDA Jeedeemetla P.O. Via Pragathi Nagar,Kukatpally, Hyderabad; Pin:5000 55 Tel:+91-40-23886047(O),Fax:+91-40-23895011(O), Tel:+91-40-23044600(R),Tel:+91-40-9440832534(Mobile) E-mail:sjo.India at gmail.com;sudheer.joseph at yahoo.com Web- http://oppamthadathil.tripod.com *************************************************************** -------------------------------------------- On Fri, 14/3/14, Eric Firing wrote: Subject: Re: [Numpy-discussion] python array To: numpy-discussion at scipy.org Date: Friday, 14 March, 2014, 7:20 AM On 2014/03/13 9:09 PM, Sudheer Joseph wrote: > Dear Oslen, > > I had? a detailed look at the example you send and points I got were below > > a = np.arange(-8, 8).reshape((4, 4)) > b = ma.masked_array(a, mask=a < 0) > > > Out[33]: b[b<4] > masked_array(data = [-- -- -- -- -- -- -- -- 0 1 2 3], >? ? ? ? ? ? ???mask = [ True? True? True? True? True? True? True? True False False False False], >? ? ? ???fill_value = 999999) > In [34]: b[b<4].shape > Out[34]: (12,) > In [35]: b[b<4].data > Out[35]: array([-8, -7, -6, -5, -4, -3, -2, -1,? 0,? 1,? 2,? 3]) > > This shows while numpy can do the bolean operation and list the data meeting the criteria( by masking the data further), it do not actually allow us get the count of data that meets the crieteria. I was interested in count. Because my objective was to find out how many numbers in the grid fall under different catagory.( <=4 , >4 & <=8 , >8<=10) etc. and find the percentage of them. > >???Is there a way to get the counts correctly ? that is my botheration now !! Certainly.? If all you need are statistics of the type you describe, where you are working with a 1-D array, then extract the unmasked values into an ordinary ndarray, and work with that: a = np.random.randn(100) am = np.ma.masked_less(a, -0.2) print am.count()? # number of masked values a_nomask = am.compressed() print type(a_nomask) print a_nomask.shape # number of points with value less than 0.5: print (a_nomask < 0.5).sum() # (Boolean True is 1) # Or if you want the actual array of values, not just the count: a_nomask[a_nomask < 0.5] Eric > > with best regards, > Sudheer _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion From gregor.thalhammer at gmail.com Fri Mar 14 05:05:15 2014 From: gregor.thalhammer at gmail.com (Gregor Thalhammer) Date: Fri, 14 Mar 2014 10:05:15 +0100 Subject: [Numpy-discussion] GSoC project: draft of proposal In-Reply-To: References: Message-ID: <97A1D9A3-BBB5-4D2B-89F5-5C5318747591@gmail.com> Am 13.03.2014 um 18:35 schrieb Leo Mao : > Hi, > > Thanks a lot for your advice, Chuck. > Following your advice, I have modified my draft of proposal. (attachment) > I think it still needs more comments so that I can make it better. > > And I found that maybe I can also make some functions related to linalg (like dot, svd or something else) faster by integrating a proper library into numpy. > > Regards, > Leo Mao > Dear Leo, large parts of your proposal are covered by the uvml package https://github.com/geggo/uvml In my opinion you should also consider Intels VML (part of MKL) as a candidate. (Yes I know, it is not free). To my best knowledge it provides many more vectorized functions than the open source alternatives. Concerning your time table, once you implemented support for one function, adding more functions is very easy. Gregor From ewm at redtetrahedron.org Fri Mar 14 06:00:17 2014 From: ewm at redtetrahedron.org (Eric Moore) Date: Fri, 14 Mar 2014 06:00:17 -0400 Subject: [Numpy-discussion] GSoC project: draft of proposal In-Reply-To: <97A1D9A3-BBB5-4D2B-89F5-5C5318747591@gmail.com> References: <97A1D9A3-BBB5-4D2B-89F5-5C5318747591@gmail.com> Message-ID: On Friday, March 14, 2014, Gregor Thalhammer wrote: > > Am 13.03.2014 um 18:35 schrieb Leo Mao > >: > > > Hi, > > > > Thanks a lot for your advice, Chuck. > > Following your advice, I have modified my draft of proposal. (attachment) > > I think it still needs more comments so that I can make it better. > > > > And I found that maybe I can also make some functions related to linalg > (like dot, svd or something else) faster by integrating a proper library > into numpy. > > > > Regards, > > Leo Mao > > > Dear Leo, > > large parts of your proposal are covered by the uvml package > https://github.com/geggo/uvml > In my opinion you should also consider Intels VML (part of MKL) as a > candidate. (Yes I know, it is not free). To my best knowledge it provides > many more vectorized functions than the open source alternatives. > Concerning your time table, once you implemented support for one function, > adding more functions is very easy. > > Gregor > > I'm not sure that your week old project is enough to discourage this gsoc project. In particular, it would be nice to be able to ship this directly as part of numpy and that won't really be possible with mlk. Eric > __________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From gregor.thalhammer at gmail.com Fri Mar 14 08:57:35 2014 From: gregor.thalhammer at gmail.com (Gregor Thalhammer) Date: Fri, 14 Mar 2014 13:57:35 +0100 Subject: [Numpy-discussion] GSoC project: draft of proposal In-Reply-To: References: <97A1D9A3-BBB5-4D2B-89F5-5C5318747591@gmail.com> Message-ID: <92158750-F4CC-460B-9953-AFEDA867A6FE@gmail.com> Am 14.03.2014 um 11:00 schrieb Eric Moore : > > > On Friday, March 14, 2014, Gregor Thalhammer wrote: > > Am 13.03.2014 um 18:35 schrieb Leo Mao : > > > Hi, > > > > Thanks a lot for your advice, Chuck. > > Following your advice, I have modified my draft of proposal. (attachment) > > I think it still needs more comments so that I can make it better. > > > > And I found that maybe I can also make some functions related to linalg (like dot, svd or something else) faster by integrating a proper library into numpy. > > > > Regards, > > Leo Mao > > > Dear Leo, > > large parts of your proposal are covered by the uvml package > https://github.com/geggo/uvml > In my opinion you should also consider Intels VML (part of MKL) as a candidate. (Yes I know, it is not free). To my best knowledge it provides many more vectorized functions than the open source alternatives. > Concerning your time table, once you implemented support for one function, adding more functions is very easy. > > Gregor > > > I'm not sure that your week old project is enough to discourage this gsoc project. In particular, it would be nice to be able to ship this directly as part of numpy and that won't really be possible with mlk. > > Eric > Hi, it's not at all my intention to discourage this project. I hope Leo Mao can use the uvml package as a starting point for further improvements. Since most vectorized math libraries share a very similar interface, I think the actual choice of the library could be made a configurable option. Adapting uvml to use e.g. yeppp instead of MKL should be straightforward. Similar to numpy or scipy built with MKL lapack and distributed by enthought or Christoph Gohlke, using MKL should not be ruled out completely. Gregor -------------- next part -------------- An HTML attachment was scrubbed... URL: From nouiz at nouiz.org Fri Mar 14 09:20:32 2014 From: nouiz at nouiz.org (=?ISO-8859-1?Q?Fr=E9d=E9ric_Bastien?=) Date: Fri, 14 Mar 2014 09:20:32 -0400 Subject: [Numpy-discussion] GSoC project: draft of proposal In-Reply-To: <92158750-F4CC-460B-9953-AFEDA867A6FE@gmail.com> References: <97A1D9A3-BBB5-4D2B-89F5-5C5318747591@gmail.com> <92158750-F4CC-460B-9953-AFEDA867A6FE@gmail.com> Message-ID: Just a comment, supporting a library that is bsd 3 clauses could help to higly reduce the compilation problem like what we have with blas. We could just include it in numpy/download it automatically or whatever to make the install trivial and then we could suppose all users have it. Deadling with blas is already not fun, if new dependency could be trivial to link to, it would be great. Fred On Fri, Mar 14, 2014 at 8:57 AM, Gregor Thalhammer wrote: > > Am 14.03.2014 um 11:00 schrieb Eric Moore : > > > > On Friday, March 14, 2014, Gregor Thalhammer > wrote: >> >> >> Am 13.03.2014 um 18:35 schrieb Leo Mao : >> >> > Hi, >> > >> > Thanks a lot for your advice, Chuck. >> > Following your advice, I have modified my draft of proposal. >> > (attachment) >> > I think it still needs more comments so that I can make it better. >> > >> > And I found that maybe I can also make some functions related to linalg >> > (like dot, svd or something else) faster by integrating a proper library >> > into numpy. >> > >> > Regards, >> > Leo Mao >> > >> Dear Leo, >> >> large parts of your proposal are covered by the uvml package >> https://github.com/geggo/uvml >> In my opinion you should also consider Intels VML (part of MKL) as a >> candidate. (Yes I know, it is not free). To my best knowledge it provides >> many more vectorized functions than the open source alternatives. >> Concerning your time table, once you implemented support for one function, >> adding more functions is very easy. >> >> Gregor >> > > I'm not sure that your week old project is enough to discourage this gsoc > project. In particular, it would be nice to be able to ship this directly as > part of numpy and that won't really be possible with mlk. > > Eric > > > > Hi, > > it's not at all my intention to discourage this project. I hope Leo Mao can > use the uvml package as a starting point for further improvements. Since > most vectorized math libraries share a very similar interface, I think the > actual choice of the library could be made a configurable option. Adapting > uvml to use e.g. yeppp instead of MKL should be straightforward. Similar to > numpy or scipy built with MKL lapack and distributed by enthought or > Christoph Gohlke, using MKL should not be ruled out completely. > > Gregor > > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From andrew.collette at gmail.com Fri Mar 14 11:26:03 2014 From: andrew.collette at gmail.com (Andrew Collette) Date: Fri, 14 Mar 2014 09:26:03 -0600 Subject: [Numpy-discussion] ANN: HDF5 for Python 2.3.0 BETA Message-ID: Announcing HDF5 for Python (h5py) 2.3.0 BETA ============================================ The h5py team is happy to announce the availability of h5py 2.3.0 beta. This beta release will be available for approximately two weeks. What's h5py? ------------ The h5py package is a Pythonic interface to the HDF5 binary data format. It lets you store huge amounts of numerical data, and easily manipulate that data from NumPy. For example, you can slice into multi-terabyte datasets stored on disk, as if they were real NumPy arrays. Thousands of datasets can be stored in a single file, categorized and tagged however you want. Changes ------- This release introduces some important new features, including: * Support for arbitrary vlen data * Improved exception messages * Improved setuptools support * Multiple additions to the low-level API * Improved support for MPI features * Single-step build for HDF5 on Windows A complete description of changes is available online: http://docs.h5py.org/en/latest/whatsnew/2.3.html Where to get it --------------- Downloads, documentation, and more are available at the h5py website: http://www.h5py.org Acknowledgements ---------------- The h5py package relies on third-party testing and contributions. For the 2.3 release, thanks especially to: * Martin Teichmann * Florian Rathgerber * Pierre de Buyl * Thomas Caswell * Andy Salnikov * Darren Dale * Robert David Grant * Toon Verstraelen * Many others who contributed bug reports and testing From lmao20001 at gmail.com Fri Mar 14 12:33:47 2014 From: lmao20001 at gmail.com (Leo Mao) Date: Sat, 15 Mar 2014 00:33:47 +0800 Subject: [Numpy-discussion] GSoC project: draft of proposal In-Reply-To: References: <97A1D9A3-BBB5-4D2B-89F5-5C5318747591@gmail.com> <92158750-F4CC-460B-9953-AFEDA867A6FE@gmail.com> Message-ID: Hi everyone, Thanks for your relies! I think Gregor's uvml package is really a good starting point for me. I think the actual choice of the library could be made a configurable > option. > Sounds like a good idea? If the implementations are very similar, maybe I can implement multiple libraries bindings? A potential issue is that some libraries may lack some functions. For example, Yeppp is a good candidates as long as it provides pre-build libraries on many platforms and its API is pretty clear. But Yeppp lacks some functions like inverse trigonometric functions. Intels VML provides much more functions but sadly it is not free. I found another library called Vc, which looks like a potential candidates for this project: http://code.compeng.uni-frankfurt.de/projects/vc I haven't digged into it yet so I'm not sure if it provides what we want. supporting a library that is bsd 3 clauses could help > to higly reduce the compilation problem like what we have with blas. > Yeppp is bsd 3 clauses so I think Yeppp is really a good choice. Is there a list of licenses which can be added into numpy without pain? (how about LGPL3 ?) Regards, Leo Mao On Fri, Mar 14, 2014 at 9:20 PM, Fr?d?ric Bastien wrote: > Just a comment, supporting a library that is bsd 3 clauses could help > to higly reduce the compilation problem like what we have with blas. > We could just include it in numpy/download it automatically or > whatever to make the install trivial and then we could suppose all > users have it. Deadling with blas is already not fun, if new > dependency could be trivial to link to, it would be great. > > Fred > > On Fri, Mar 14, 2014 at 8:57 AM, Gregor Thalhammer > wrote: > > > > Am 14.03.2014 um 11:00 schrieb Eric Moore : > > > > > > > > On Friday, March 14, 2014, Gregor Thalhammer < > gregor.thalhammer at gmail.com> > > wrote: > >> > >> > >> Am 13.03.2014 um 18:35 schrieb Leo Mao : > >> > >> > Hi, > >> > > >> > Thanks a lot for your advice, Chuck. > >> > Following your advice, I have modified my draft of proposal. > >> > (attachment) > >> > I think it still needs more comments so that I can make it better. > >> > > >> > And I found that maybe I can also make some functions related to > linalg > >> > (like dot, svd or something else) faster by integrating a proper > library > >> > into numpy. > >> > > >> > Regards, > >> > Leo Mao > >> > > >> Dear Leo, > >> > >> large parts of your proposal are covered by the uvml package > >> https://github.com/geggo/uvml > >> In my opinion you should also consider Intels VML (part of MKL) as a > >> candidate. (Yes I know, it is not free). To my best knowledge it > provides > >> many more vectorized functions than the open source alternatives. > >> Concerning your time table, once you implemented support for one > function, > >> adding more functions is very easy. > >> > >> Gregor > >> > > > > I'm not sure that your week old project is enough to discourage this gsoc > > project. In particular, it would be nice to be able to ship this > directly as > > part of numpy and that won't really be possible with mlk. > > > > Eric > > > > > > > > Hi, > > > > it's not at all my intention to discourage this project. I hope Leo Mao > can > > use the uvml package as a starting point for further improvements. Since > > most vectorized math libraries share a very similar interface, I think > the > > actual choice of the library could be made a configurable option. > Adapting > > uvml to use e.g. yeppp instead of MKL should be straightforward. Similar > to > > numpy or scipy built with MKL lapack and distributed by enthought or > > Christoph Gohlke, using MKL should not be ruled out completely. > > > > Gregor > > > > > > > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Fri Mar 14 12:40:53 2014 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 14 Mar 2014 16:40:53 +0000 Subject: [Numpy-discussion] GSoC project: draft of proposal In-Reply-To: References: <97A1D9A3-BBB5-4D2B-89F5-5C5318747591@gmail.com> <92158750-F4CC-460B-9953-AFEDA867A6FE@gmail.com> Message-ID: On Fri, Mar 14, 2014 at 4:33 PM, Leo Mao wrote: > Yeppp is bsd 3 clauses so I think Yeppp is really a good choice. > Is there a list of licenses which can be added into numpy without pain? (how > about LGPL3 ?) No, just BSD and its rough equivalents like the Expat license. -- Robert Kern From jtaylor.debian at googlemail.com Fri Mar 14 15:42:35 2014 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Fri, 14 Mar 2014 20:42:35 +0100 Subject: [Numpy-discussion] GSoC project: draft of proposal In-Reply-To: References: <97A1D9A3-BBB5-4D2B-89F5-5C5318747591@gmail.com> <92158750-F4CC-460B-9953-AFEDA867A6FE@gmail.com> Message-ID: <53235BAB.5090500@googlemail.com> On 14.03.2014 17:40, Robert Kern wrote: > On Fri, Mar 14, 2014 at 4:33 PM, Leo Mao wrote: > >> Yeppp is bsd 3 clauses so I think Yeppp is really a good choice. >> Is there a list of licenses which can be added into numpy without pain? (how >> about LGPL3 ?) > > No, just BSD and its rough equivalents like the Expat license. > They can't be added into numpy, but support linking or building against a non-bsd like library can still be added. Only our binary distributions are limited in what we can use. From robert.kern at gmail.com Fri Mar 14 15:44:59 2014 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 14 Mar 2014 19:44:59 +0000 Subject: [Numpy-discussion] GSoC project: draft of proposal In-Reply-To: <53235BAB.5090500@googlemail.com> References: <97A1D9A3-BBB5-4D2B-89F5-5C5318747591@gmail.com> <92158750-F4CC-460B-9953-AFEDA867A6FE@gmail.com> <53235BAB.5090500@googlemail.com> Message-ID: On Fri, Mar 14, 2014 at 7:42 PM, Julian Taylor wrote: > On 14.03.2014 17:40, Robert Kern wrote: >> On Fri, Mar 14, 2014 at 4:33 PM, Leo Mao wrote: >> >>> Yeppp is bsd 3 clauses so I think Yeppp is really a good choice. >>> Is there a list of licenses which can be added into numpy without pain? (how >>> about LGPL3 ?) >> >> No, just BSD and its rough equivalents like the Expat license. > > They can't be added into numpy, but support linking or building against > a non-bsd like library can still be added. > Only our binary distributions are limited in what we can use. Optionally, yes. -- Robert Kern From njs at pobox.com Fri Mar 14 20:51:45 2014 From: njs at pobox.com (Nathaniel Smith) Date: Sat, 15 Mar 2014 00:51:45 +0000 Subject: [Numpy-discussion] It looks like Py 3.5 will include a dedicated infix matrix multiply operator Message-ID: Well, that was fast. Guido says he'll accept the addition of '@' as an infix operator for matrix multiplication, once some details are ironed out: https://mail.python.org/pipermail/python-ideas/2014-March/027109.html http://legacy.python.org/dev/peps/pep-0465/ Specifically, we need to figure out whether we want to make an argument for a matrix power operator ("@@"), and what precedence/associativity we want '@' to have. I'll post two separate threads to get feedback on those in an organized way -- this is just a heads-up. -n -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org From aron at ahmadia.net Fri Mar 14 20:57:58 2014 From: aron at ahmadia.net (Aron Ahmadia) Date: Fri, 14 Mar 2014 20:57:58 -0400 Subject: [Numpy-discussion] It looks like Py 3.5 will include a dedicated infix matrix multiply operator In-Reply-To: References: Message-ID: That's the best news I've had all week. Thanks for all your work on this Nathan. -A On Fri, Mar 14, 2014 at 8:51 PM, Nathaniel Smith wrote: > Well, that was fast. Guido says he'll accept the addition of '@' as an > infix operator for matrix multiplication, once some details are ironed > out: > https://mail.python.org/pipermail/python-ideas/2014-March/027109.html > http://legacy.python.org/dev/peps/pep-0465/ > > Specifically, we need to figure out whether we want to make an > argument for a matrix power operator ("@@"), and what > precedence/associativity we want '@' to have. I'll post two separate > threads to get feedback on those in an organized way -- this is just a > heads-up. > > -n > > -- > Nathaniel J. Smith > Postdoctoral researcher - Informatics - University of Edinburgh > http://vorpus.org > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From nouiz at nouiz.org Fri Mar 14 21:01:08 2014 From: nouiz at nouiz.org (=?ISO-8859-1?Q?Fr=E9d=E9ric_Bastien?=) Date: Fri, 14 Mar 2014 21:01:08 -0400 Subject: [Numpy-discussion] It looks like Py 3.5 will include a dedicated infix matrix multiply operator In-Reply-To: References: Message-ID: This is great news. Excellent work Nathaniel and all others! Fr?d?ric On Fri, Mar 14, 2014 at 8:57 PM, Aron Ahmadia wrote: > That's the best news I've had all week. > > Thanks for all your work on this Nathan. > > -A > > > On Fri, Mar 14, 2014 at 8:51 PM, Nathaniel Smith wrote: >> >> Well, that was fast. Guido says he'll accept the addition of '@' as an >> infix operator for matrix multiplication, once some details are ironed >> out: >> https://mail.python.org/pipermail/python-ideas/2014-March/027109.html >> http://legacy.python.org/dev/peps/pep-0465/ >> >> Specifically, we need to figure out whether we want to make an >> argument for a matrix power operator ("@@"), and what >> precedence/associativity we want '@' to have. I'll post two separate >> threads to get feedback on those in an organized way -- this is just a >> heads-up. >> >> -n >> >> -- >> Nathaniel J. Smith >> Postdoctoral researcher - Informatics - University of Edinburgh >> http://vorpus.org >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From projetmbc at gmail.com Fri Mar 14 22:16:12 2014 From: projetmbc at gmail.com (Christophe Bal) Date: Sat, 15 Mar 2014 03:16:12 +0100 Subject: [Numpy-discussion] It looks like Py 3.5 will include a dedicated infix matrix multiply operator In-Reply-To: References: Message-ID: This id good for Numpyists but this will be another operator that good also help in another contexts. As a math user, I was first very skeptical but finally this is a good news for non Numpyists too. Christophe BAL Le 15 mars 2014 02:01, "Fr?d?ric Bastien" a ?crit : > This is great news. Excellent work Nathaniel and all others! > > Fr?d?ric > > On Fri, Mar 14, 2014 at 8:57 PM, Aron Ahmadia wrote: > > That's the best news I've had all week. > > > > Thanks for all your work on this Nathan. > > > > -A > > > > > > On Fri, Mar 14, 2014 at 8:51 PM, Nathaniel Smith wrote: > >> > >> Well, that was fast. Guido says he'll accept the addition of '@' as an > >> infix operator for matrix multiplication, once some details are ironed > >> out: > >> https://mail.python.org/pipermail/python-ideas/2014-March/027109.html > >> http://legacy.python.org/dev/peps/pep-0465/ > >> > >> Specifically, we need to figure out whether we want to make an > >> argument for a matrix power operator ("@@"), and what > >> precedence/associativity we want '@' to have. I'll post two separate > >> threads to get feedback on those in an organized way -- this is just a > >> heads-up. > >> > >> -n > >> > >> -- > >> Nathaniel J. Smith > >> Postdoctoral researcher - Informatics - University of Edinburgh > >> http://vorpus.org > >> _______________________________________________ > >> NumPy-Discussion mailing list > >> NumPy-Discussion at scipy.org > >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.laumann at gmail.com Fri Mar 14 23:18:46 2014 From: chris.laumann at gmail.com (Chris Laumann) Date: Fri, 14 Mar 2014 20:18:46 -0700 Subject: [Numpy-discussion] It looks like Py 3.5 will include a dedicated infix matrix multiply operator In-Reply-To: References: Message-ID: That?s great.? Does this mean that, in the not-so-distant future, the matrix class will go the way of the dodos? I have had more subtle to fix bugs sneak into code b/c something returns a matrix instead of an array than almost any other single source I can think of. Having two almost indistinguishable types for 2d arrays with slightly different semantics for a small subset of operations is terrible. Best, C --? Chris Laumann Sent with Airmail On March 14, 2014 at 7:16:24 PM, Christophe Bal (projetmbc at gmail.com) wrote: This id good for Numpyists but this will be another operator that good also help in another contexts. As a math user, I was first very skeptical but finally this is a good news for non Numpyists too. Christophe BAL Le 15 mars 2014 02:01, "Fr?d?ric Bastien" a ?crit : This is great news. Excellent work Nathaniel and all others! Fr?d?ric On Fri, Mar 14, 2014 at 8:57 PM, Aron Ahmadia wrote: > That's the best news I've had all week. > > Thanks for all your work on this Nathan. > > -A > > > On Fri, Mar 14, 2014 at 8:51 PM, Nathaniel Smith wrote: >> >> Well, that was fast. Guido says he'll accept the addition of '@' as an >> infix operator for matrix multiplication, once some details are ironed >> out: >> ? https://mail.python.org/pipermail/python-ideas/2014-March/027109.html >> ? http://legacy.python.org/dev/peps/pep-0465/ >> >> Specifically, we need to figure out whether we want to make an >> argument for a matrix power operator ("@@"), and what >> precedence/associativity we want '@' to have. I'll post two separate >> threads to get feedback on those in an organized way -- this is just a >> heads-up. >> >> -n >> >> -- >> Nathaniel J. Smith >> Postdoctoral researcher - Informatics - University of Edinburgh >> http://vorpus.org >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Fri Mar 14 23:41:50 2014 From: njs at pobox.com (Nathaniel Smith) Date: Sat, 15 Mar 2014 03:41:50 +0000 Subject: [Numpy-discussion] [help needed] associativity and precedence of '@' Message-ID: Hi all, Here's the main blocker for adding a matrix multiply operator '@' to Python: we need to decide what we think its precedence and associativity should be. I'll explain what that means so we're on the same page, and what the choices are, and then we can all argue about it. But even better would be if we could get some data to guide our decision, and this would be a lot easier if some of you all can help; I'll suggest some ways you might be able to do that. So! Precedence and left- versus right-associativity. If you already know what these are you can skim down until you see CAPITAL LETTERS. We all know what precedence is. Code like this: a + b * c gets evaluated as: a + (b * c) because * has higher precedence than +. It "binds more tightly", as they say. Python's complete precedence able is here: http://docs.python.org/3/reference/expressions.html#operator-precedence Associativity, in the parsing sense, is less well known, though it's just as important. It's about deciding how to evaluate code like this: a * b * c Do we use a * (b * c) # * is "right associative" or (a * b) * c # * is "left associative" ? Here all the operators have the same precedence (because, uh... they're the same operator), so precedence doesn't help. And mostly we can ignore this in day-to-day life, because both versions give the same answer, so who cares. But a programming language has to pick one (consider what happens if one of those objects has a non-default __mul__ implementation). And of course it matters a lot for non-associative operations like a - b - c or a / b / c So when figuring out order of evaluations, what you do first is check the precedence, and then if you have multiple operators next to each other with the same precedence, you check their associativity. Notice that this means that if you have different operators that share the same precedence level (like + and -, or * and /), then they have to all have the same associativity. All else being equal, it's generally considered nice to have fewer precedence levels, because these have to be memorized by users. Right now in Python, every precedence level is left-associative, except for '**'. If you write these formulas without any parentheses, then what the interpreter will actually execute is: (a * b) * c (a - b) - c (a / b) / c but a ** (b ** c) Okay, that's the background. Here's the question. We need to decide on precedence and associativity for '@'. In particular, there are three different options that are interesting: OPTION 1 FOR @: Precedence: same as * Associativity: left My shorthand name for it: "same-left" (yes, very creative) This means that if you don't use parentheses, you get: a @ b @ c -> (a @ b) @ c a * b @ c -> (a * b) @ c a @ b * c -> (a @ b) * c OPTION 2 FOR @: Precedence: more-weakly-binding than * Associativity: right My shorthand name for it: "weak-right" This means that if you don't use parentheses, you get: a @ b @ c -> a @ (b @ c) a * b @ c -> (a * b) @ c a @ b * c -> a @ (b * c) OPTION 3 FOR @: Precedence: more-tightly-binding than * Associativity: right My shorthand name for it: "tight-right" This means that if you don't use parentheses, you get: a @ b @ c -> a @ (b @ c) a * b @ c -> a * (b @ c) a @ b * c -> (a @ b) * c We need to pick which of which options we think is best, based on whatever reasons we can think of, ideally more than "hmm, weak-right gives me warm fuzzy feelings" ;-). (In principle the other 2 possible options are tight-left and weak-left, but there doesn't seem to be any argument in favor of either, so we'll leave them out of the discussion.) Some things to consider: * and @ are actually not associative (in the math sense) with respect to each other, i.e., (a * b) @ c and a * (b @ c) in general give different results when 'a' is not a scalar. So considering the two expressions 'a * b @ c' and 'a @ b * c', we can see that each of these three options gives produces different results in some cases. "Same-left" is the easiest to explain and remember, because it's just, "@ acts like * and /". So we already have to know the rule in order to understand other non-associative expressions like a / b / c or a - b - c, and it'd be nice if the same rule applied to things like a * b @ c so we only had to memorize *one* rule. (Of course there's ** which uses the opposite rule, but I guess everyone internalized that one in secondary school; that's not true for * versus @.) This is definitely the default we should choose unless we have a good reason to do otherwise. BUT: there might indeed be a good reason to do otherwise, which is the whole reason this has come up. Consider: Mat1 @ Mat2 @ vec Obviously this will execute much more quickly if we do Mat1 @ (Mat2 @ vec) because that results in two cheap matrix-vector multiplies, while (Mat1 @ Mat2) @ vec starts out by doing an expensive matrix-matrix multiply. So: maybe @ should be right associative, so that we get the fast behaviour without having to use explicit parentheses! /If/ these kinds of expressions are common enough that having to remember to put explicit parentheses in all the time is more of a programmer burden than having to memorize a special associativity rule for @. Obviously Mat @ Mat @ vec is more common than vec @ Mat @ Mat, but maybe they're both so rare that it doesn't matter in practice -- I don't know. Also, if we do want @ to be right associative, then I can't think of any clever reasons to prefer weak-right over tight-right, or vice-versa. For the scalar multiplication case, I believe both options produce the same result in the same amount of time. For the non-scalar case, they give different answers. Do people have strong intuitions about what expressions like a * b @ c a @ b * c should do actually? (I'm guessing not, but hey, you never know.) And, while intuition is useful, it would be really *really* nice to be basing these decisions on more than *just* intuition, since whatever we decide will be subtly influencing the experience of writing linear algebra code in Python for the rest of time. So here's where I could use some help. First, of course, if you have any other reasons why one or the other of these options is better, then please share! But second, I think we need to know something about how often the Mat @ Mat @ vec type cases arise in practice. How often do non-scalar * and np.dot show up in the same expression? How often does it look like a * np.dot(b, c), and how often does it look like np.dot(a * b, c)? How often do we see expressions like np.dot(np.dot(a, b), c), and how often do we see expressions like np.dot(a, np.dot(b, c))? This would really help guide the debate. I don't have this data, and I'm not sure the best way to get it. A super-fancy approach would be to write a little script that uses the 'ast' module to count things automatically. A less fancy approach would be to just pick some code you've written, or a well-known package, grep through for calls to 'dot', and make notes on what you see. (An advantage of the less-fancy approach is that as a human you might be able to tell the difference between scalar and non-scalar *, or check whether it actually matters what order the 'dot' calls are done in.) -n -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Fri Mar 14 23:48:13 2014 From: njs at pobox.com (Nathaniel Smith) Date: Sat, 15 Mar 2014 03:48:13 +0000 Subject: [Numpy-discussion] It looks like Py 3.5 will include a dedicated infix matrix multiply operator In-Reply-To: References: Message-ID: On Sat, Mar 15, 2014 at 3:18 AM, Chris Laumann wrote: > > That?s great. > > Does this mean that, in the not-so-distant future, the matrix class will go the way of the dodos? I have had more subtle to fix bugs sneak into code b/c something returns a matrix instead of an array than almost any other single source I can think of. Having two almost indistinguishable types for 2d arrays with slightly different semantics for a small subset of operations is terrible. Well, it depends on what your definition of "distant" is :-). Py 3.5 won't be out for some time (3.*4* is coming out this week). And we'll still need to sit down and figure out if there's any bits of matrix we want to save (e.g., maybe create an ndarray version of the parser used for np.matrix("1 2; 3 4")), come up with a transition plan, have a long mailing list argument about it, etc. But the goal (IMO) is definitely to get rid of np.matrix as soon as reasonable given these considerations, and similarly to find a way to switch scipy.sparse matrices to a more ndarray-like API. So it'll be a few years at least, but I think we'll get there. -n -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org From chris.laumann at gmail.com Sat Mar 15 00:15:40 2014 From: chris.laumann at gmail.com (Chris Laumann) Date: Fri, 14 Mar 2014 21:15:40 -0700 Subject: [Numpy-discussion] [help needed] associativity and precedence of '@' In-Reply-To: References: Message-ID: Hi all, Let me preface my two cents by saying that I think the best part of @ being accepted is the potential for deprecating the matrix class ? the syntactic beauty of infix for matrix multiply is a nice side effect IMHO :) This may be why my basic attitude is: I don?t think it matters very much but I would vote (weakly) for weak-right. Where there is ambiguity, I suspect most practitioners will just put in parentheses anyway ? especially with combinations of * and @, where I don?t think there is a natural intuitive precedence relationship. At least, element-wise multiplication is very rare in math/physics texts as an explicitly defined elementary operation so I?d be surprised if anybody had a strong intuition about the precedence of the ?*? operator. And the binding order doesn?t matter if it is scalar multiplication. I have quite a bit of code with large matrices where the order of matrix-vector multiplies is an important optimization and I would certainly have a few simpler looking expressions for op @ op @ vec, hence the weak preference for right-associativity. That said, I routinely come across situations where the optimal matrix multiplication order is more complicated than can be expressed as left-right or right-left (because some matrices might be diagonal, CSR or CSC), which is why the preference is only weak. I don?t see a down-side in the use-case that it is actually associative (as in matrix-matrix-vector).? Best, Chris --? Chris Laumann Sent with Airmail On March 14, 2014 at 8:42:00 PM, Nathaniel Smith (njs at pobox.com) wrote: Hi all, Here's the main blocker for adding a matrix multiply operator '@' to Python: we need to decide what we think its precedence and associativity should be. I'll explain what that means so we're on the same page, and what the choices are, and then we can all argue about it. But even better would be if we could get some data to guide our decision, and this would be a lot easier if some of you all can help; I'll suggest some ways you might be able to do that. So! Precedence and left- versus right-associativity. If you already know what these are you can skim down until you see CAPITAL LETTERS. We all know what precedence is. Code like this: ? a + b * c gets evaluated as: ? a + (b * c) because * has higher precedence than +. It "binds more tightly", as they say. Python's complete precedence able is here: ? http://docs.python.org/3/reference/expressions.html#operator-precedence Associativity, in the parsing sense, is less well known, though it's just as important. It's about deciding how to evaluate code like this: ? a * b * c Do we use ? a * (b * c) ? ?# * is "right associative" or ? (a * b) * c ? ?# * is "left associative" ? Here all the operators have the same precedence (because, uh... they're the same operator), so precedence doesn't help. And mostly we can ignore this in day-to-day life, because both versions give the same answer, so who cares. But a programming language has to pick one (consider what happens if one of those objects has a non-default __mul__ implementation). And of course it matters a lot for non-associative operations like ? a - b - c or ? a / b / c So when figuring out order of evaluations, what you do first is check the precedence, and then if you have multiple operators next to each other with the same precedence, you check their associativity. Notice that this means that if you have different operators that share the same precedence level (like + and -, or * and /), then they have to all have the same associativity. All else being equal, it's generally considered nice to have fewer precedence levels, because these have to be memorized by users. Right now in Python, every precedence level is left-associative, except for '**'. If you write these formulas without any parentheses, then what the interpreter will actually execute is: ? (a * b) * c ? (a - b) - c ? (a / b) / c but ? a ** (b ** c) Okay, that's the background. Here's the question. We need to decide on precedence and associativity for '@'. In particular, there are three different options that are interesting: OPTION 1 FOR @: Precedence: same as * Associativity: left My shorthand name for it: "same-left" (yes, very creative) This means that if you don't use parentheses, you get: ?? a @ b @ c? ->? (a @ b) @ c ?? a * b @ c? ->? (a * b) @ c ?? a @ b * c? ->? (a @ b) * c OPTION 2 FOR @: Precedence: more-weakly-binding than * Associativity: right My shorthand name for it: "weak-right" This means that if you don't use parentheses, you get: ?? a @ b @ c? ->? a @ (b @ c) ?? a * b @ c? ->? (a * b) @ c ?? a @ b * c? ->? a @ (b * c) OPTION 3 FOR @: Precedence: more-tightly-binding than * Associativity: right My shorthand name for it: "tight-right" This means that if you don't use parentheses, you get: ?? a @ b @ c? ->? a @ (b @ c) ?? a * b @ c? ->? a * (b @ c) ?? a @ b * c? ->? (a @ b) * c We need to pick which of which options we think is best, based on whatever reasons we can think of, ideally more than "hmm, weak-right gives me warm fuzzy feelings" ;-). (In principle the other 2 possible options are tight-left and weak-left, but there doesn't seem to be any argument in favor of either, so we'll leave them out of the discussion.) Some things to consider: * and @ are actually not associative (in the math sense) with respect to each other, i.e., (a * b) @ c and a * (b @ c) in general give different results when 'a' is not a scalar. So considering the two expressions 'a * b @ c' and 'a @ b * c', we can see that each of these three options gives produces different results in some cases. "Same-left" is the easiest to explain and remember, because it's just, "@ acts like * and /". So we already have to know the rule in order to understand other non-associative expressions like a / b / c or a - b - c, and it'd be nice if the same rule applied to things like a * b @ c so we only had to memorize *one* rule. (Of course there's ** which uses the opposite rule, but I guess everyone internalized that one in secondary school; that's not true for * versus @.) This is definitely the default we should choose unless we have a good reason to do otherwise. BUT: there might indeed be a good reason to do otherwise, which is the whole reason this has come up. Consider: ? ? Mat1 @ Mat2 @ vec Obviously this will execute much more quickly if we do ? ? Mat1 @ (Mat2 @ vec) because that results in two cheap matrix-vector multiplies, while ? ? (Mat1 @ Mat2) @ vec starts out by doing an expensive matrix-matrix multiply. So: maybe @ should be right associative, so that we get the fast behaviour without having to use explicit parentheses! /If/ these kinds of expressions are common enough that having to remember to put explicit parentheses in all the time is more of a programmer burden than having to memorize a special associativity rule for @. Obviously Mat @ Mat @ vec is more common than vec @ Mat @ Mat, but maybe they're both so rare that it doesn't matter in practice -- I don't know. Also, if we do want @ to be right associative, then I can't think of any clever reasons to prefer weak-right over tight-right, or vice-versa. For the scalar multiplication case, I believe both options produce the same result in the same amount of time. For the non-scalar case, they give different answers. Do people have strong intuitions about what expressions like ? a * b @ c ? a @ b * c should do actually? (I'm guessing not, but hey, you never know.) And, while intuition is useful, it would be really *really* nice to be basing these decisions on more than *just* intuition, since whatever we decide will be subtly influencing the experience of writing linear algebra code in Python for the rest of time. So here's where I could use some help. First, of course, if you have any other reasons why one or the other of these options is better, then please share! But second, I think we need to know something about how often the Mat @ Mat @ vec type cases arise in practice. How often do non-scalar * and np.dot show up in the same expression? How often does it look like a * np.dot(b, c), and how often does it look like np.dot(a * b, c)? How often do we see expressions like np.dot(np.dot(a, b), c), and how often do we see expressions like np.dot(a, np.dot(b, c))? This would really help guide the debate. I don't have this data, and I'm not sure the best way to get it. A super-fancy approach would be to write a little script that uses the 'ast' module to count things automatically. A less fancy approach would be to just pick some code you've written, or a well-known package, grep through for calls to 'dot', and make notes on what you see. (An advantage of the less-fancy approach is that as a human you might be able to tell the difference between scalar and non-scalar *, or check whether it actually matters what order the 'dot' calls are done in.) -n -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From travis at continuum.io Sat Mar 15 00:25:04 2014 From: travis at continuum.io (Travis Oliphant) Date: Fri, 14 Mar 2014 23:25:04 -0500 Subject: [Numpy-discussion] It looks like Py 3.5 will include a dedicated infix matrix multiply operator In-Reply-To: References: Message-ID: Congratulations Nathaniel! This is great news! Well done on starting the process and taking things forward. Travis On Mar 14, 2014 7:51 PM, "Nathaniel Smith" wrote: > Well, that was fast. Guido says he'll accept the addition of '@' as an > infix operator for matrix multiplication, once some details are ironed > out: > https://mail.python.org/pipermail/python-ideas/2014-March/027109.html > http://legacy.python.org/dev/peps/pep-0465/ > > Specifically, we need to figure out whether we want to make an > argument for a matrix power operator ("@@"), and what > precedence/associativity we want '@' to have. I'll post two separate > threads to get feedback on those in an organized way -- this is just a > heads-up. > > -n > > -- > Nathaniel J. Smith > Postdoctoral researcher - Informatics - University of Edinburgh > http://vorpus.org > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Sat Mar 15 00:32:00 2014 From: njs at pobox.com (Nathaniel Smith) Date: Sat, 15 Mar 2014 04:32:00 +0000 Subject: [Numpy-discussion] [RFC] should we argue for a matrix power operator, @@? Message-ID: Hi all, Here's the second thread for discussion about Guido's concerns about PEP 465. The issue here is that PEP 465 as currently written proposes two new operators, @ for matrix multiplication and @@ for matrix power (analogous to * and **): http://legacy.python.org/dev/peps/pep-0465/ The main thing we care about of course is @; I pushed for including @@ because I thought it was nicer to have than not, and I thought the analogy between * and ** might make the overall package more appealing to Guido's aesthetic sense. It turns out I was wrong :-). Guido is -0 on @@, but willing to be swayed if we think it's worth the trouble to make a solid case. Note that question now is *not*, how will @@ affect the reception of @. @ itself is AFAICT a done deal, regardless of what happens with @@. For this discussion let's assume @ can be taken for granted, and that we can freely choose to either add @@ or not add @@ to the language. The question is: which do we think makes Python a better language (for us and in general)? Some thoughts to start us off: Here are the interesting use cases for @@ that I can think of: - 'vector @@ 2' gives the squared Euclidean length (because it's the same as vector @ vector). Kind of handy. - 'matrix @@ n' of course gives the matrix power, which is of marginal use but does come in handy sometimes, e.g., when looking at graph connectivity. - 'matrix @@ -1' provides a very transparent notation for translating textbook formulas (with all their inverses) into code. It's a bit unhelpful in practice, because (a) usually you should use solve(), and (b) 'matrix @@ -1' is actually more characters than 'inv(matrix)'. But sometimes transparent notation may be important. (And in some cases, like using numba or theano or whatever, 'matrix @@ -1 @ foo' could be compiled into a call to solve() anyway.) (Did I miss any?) In practice it seems to me that the last use case is the one that's might matter a lot practice, but then again, it might not -- I'm not sure. For example, does anyone who teaches programming with numpy have a feeling about whether the existence of '@@ -1' would make a big difference to you and your students? (Alan? I know you were worried about losing the .I attribute on matrices if switching to ndarrays for teaching -- given that ndarray will probably not get a .I attribute, how much would the existence of @@ -1 affect you?) On a more technical level, Guido is worried about how @@'s precedence should work (and this is somewhat related to the other thread about @'s precedence and associativity, because he feels that if we end up giving @ and * different precedence, then that makes it much less clear what to do with @@, and reduces the strength of the */**/@/@@ analogy). In particular, if we want to argue for @@ then we'll need to figure out what expressions like a @@ b @@ c and a ** b @@ c and a @@ b ** c should do. A related question is what @@ should do if given an array as its right argument. In the current PEP, only integers are accepted, which rules out a bunch of the more complicated cases like a @@ b @@ c (at least assuming @@ is right-associative, like **, and I can't see why you'd want anything else). OTOH, in the brave new gufunc world, it technically would make sense to define @@ as being a gufunc with signature (m,m),()->(m,m), and the way gufuncs work this *would* allow the "power" to be an array -- for example, we'd have: mat = randn(m, m) pow = range(n) result = gufunc_matrix_power(mat, pow) assert result.shape == (n, m, m) for i in xrange(n): assert np.all(result[i, :, :] == mat ** i) In this case, a @@ b @@ c would at least be a meaningful expression to write. OTOH it would be incredibly bizarre and useless, so probably no-one would ever write it. As far as these technical issues go, my guess is that the correct rule is that @@ should just have the same precedence and the same (right) associativity as **, and in practice no-one will ever write stuff like a @@ b @@ c. But if we want to argue for @@ we need to come to some consensus or another here. It's also possible the answer is "ugh, these issues are too complicated, we should defer this until later when we have more experience with @ and gufuncs and stuff". After all, I doubt anyone else will swoop in and steal @@ to mean something else! OTOH, if e.g. there's a strong feeling that '@@ -1' will make a big difference in pedagogical contexts, then putting that off for years might be a mistake. -n -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org From jaime.frio at gmail.com Sat Mar 15 01:39:23 2014 From: jaime.frio at gmail.com (=?ISO-8859-1?Q?Jaime_Fern=E1ndez_del_R=EDo?=) Date: Fri, 14 Mar 2014 22:39:23 -0700 Subject: [Numpy-discussion] [RFC] should we argue for a matrix power operator, @@? In-Reply-To: References: Message-ID: On Fri, Mar 14, 2014 at 9:32 PM, Nathaniel Smith wrote: > > Here are the interesting use cases for @@ that I can think of: > - 'vector @@ 2' gives the squared Euclidean length (because it's the > same as vector @ vector). Kind of handy. > - 'matrix @@ n' of course gives the matrix power, which is of marginal > use but does come in handy sometimes, e.g., when looking at graph > connectivity. > - 'matrix @@ -1' provides a very transparent notation for translating > textbook formulas (with all their inverses) into code. It's a bit > unhelpful in practice, because (a) usually you should use solve(), and > (b) 'matrix @@ -1' is actually more characters than 'inv(matrix)'. But > sometimes transparent notation may be important. (And in some cases, > like using numba or theano or whatever, 'matrix @@ -1 @ foo' could be > compiled into a call to solve() anyway.) > > (Did I miss any?) > I'm not really arguing for it, and I am not sure how, or even if, it fits in the general scheme. But for completeness sake, 'e @@ Matrix' is used in some treatments of linear systems of differential equations, where: d/dt = @ would have solution = e @@ ( * t) @ I don't think it makes any sense to use it as such in the context of numpy, as I think it would make broadcasting undecidable. But there may be parallel universes where having n @@ and @@ n both with well defined, yet different meanings may make sense. It is my impression that in this entirely made up scenario you would want e @@ A @@ 3 to be evaluated as (e @@ A) @@ 3. Which probably has more to do with the fact that the two @@ mean different things, than with the associativity that repeated calls to the same @@ should have. Personally I couldn't care less, and if I had a vote I would let @@ rest for now, until we see how @ plays out. Jaime -------------- next part -------------- An HTML attachment was scrubbed... URL: From jaime.frio at gmail.com Sat Mar 15 01:49:28 2014 From: jaime.frio at gmail.com (=?ISO-8859-1?Q?Jaime_Fern=E1ndez_del_R=EDo?=) Date: Fri, 14 Mar 2014 22:49:28 -0700 Subject: [Numpy-discussion] [help needed] associativity and precedence of '@' In-Reply-To: References: Message-ID: On Fri, Mar 14, 2014 at 9:15 PM, Chris Laumann wrote: > Hi all, > > Let me preface my two cents by saying that I think the best part of @ > being accepted is the potential for deprecating the matrix class -- the > syntactic beauty of infix for matrix multiply is a nice side effect IMHO :) > This may be why my basic attitude is: > > I don't think it matters very much but I would vote (weakly) for > weak-right. Where there is ambiguity, I suspect most practitioners will > just put in parentheses anyway -- especially with combinations of * and @, > where I don't think there is a natural intuitive precedence relationship. > At least, element-wise multiplication is very rare in math/physics texts as > an explicitly defined elementary operation so I'd be surprised if anybody > had a strong intuition about the precedence of the '*' operator. > My take on this is that if you mix * and @, you are probably using * to build the matrices you want to __matmul__ with @. So weak-right would be the way to go from that point of view. Jaime -------------- next part -------------- An HTML attachment was scrubbed... URL: From projetmbc at gmail.com Sat Mar 15 06:20:07 2014 From: projetmbc at gmail.com (Christophe Bal) Date: Sat, 15 Mar 2014 11:20:07 +0100 Subject: [Numpy-discussion] [RFC] should we argue for a matrix power operator, @@? In-Reply-To: References: Message-ID: Hello. Maybe a solution would be to not see @ and @@ only from the matrix point of view. Why ? The philosophy of Python is to give total control of the infix operators +, * and ** for example via the magic methods. So it can be also the case for @ and @@ that could be use for something else that @@. So what we can expect from A@@B@@C. I will say that is the same as a**b**c because a human goes from top to down (but this is not a general convention in CAS). Ok guy but what can we do for @@@@. Just raises an error. The programmer has the possibility to use @@ as ** but it has to take care of the meaning regarding to the types of the objects. This is for example what we expect for @@pi even if we mathematically can give a meaning to that for some matrices. Do not forget also that a direct computation of the inverse of a matrice is a complicated things, and that integer power of matrices have to be cleverly build, but I'm sure that everyones here know that. *So standard Python can...* - only proposes multiplication of matrices, - and for the power of matrices, just indicates that there is a magic method associated to @@ and explains that regarding to the complexity of this problem, it will be the job of the programmer to implement it. I think the problem from Guido's point of view is the asymmetrical type domain for operations. All the numeric operators are from * to . Hoping that my frenchy english is clear enough. Chrisopthe BAL PS: maybe a good question for Python would be to see if other operators could be useful. For CAS, I would like to have the possibility to use f?g for composition, even if it is more for pedagogical reason, and f??n for dynamical systems. But this is just a dream... 2014-03-15 6:39 GMT+01:00 Jaime Fern?ndez del R?o : > On Fri, Mar 14, 2014 at 9:32 PM, Nathaniel Smith wrote: > >> >> Here are the interesting use cases for @@ that I can think of: >> - 'vector @@ 2' gives the squared Euclidean length (because it's the >> same as vector @ vector). Kind of handy. >> - 'matrix @@ n' of course gives the matrix power, which is of marginal >> use but does come in handy sometimes, e.g., when looking at graph >> connectivity. >> - 'matrix @@ -1' provides a very transparent notation for translating >> textbook formulas (with all their inverses) into code. It's a bit >> unhelpful in practice, because (a) usually you should use solve(), and >> (b) 'matrix @@ -1' is actually more characters than 'inv(matrix)'. But >> sometimes transparent notation may be important. (And in some cases, >> like using numba or theano or whatever, 'matrix @@ -1 @ foo' could be >> compiled into a call to solve() anyway.) >> >> (Did I miss any?) >> > > I'm not really arguing for it, and I am not sure how, or even if, it fits > in the general scheme. But for completeness sake, 'e @@ Matrix' is used in > some treatments of linear systems of differential equations, where: > > d/dt = @ > > would have solution > > = e @@ ( * t) @ > > I don't think it makes any sense to use it as such in the context of > numpy, as I think it would make broadcasting undecidable. But there may be > parallel universes where having n @@ and @@ n both with > well defined, yet different meanings may make sense. It is my impression > that in this entirely made up scenario you would want e @@ A @@ 3 to be > evaluated as (e @@ A) @@ 3. Which probably has more to do with the fact > that the two @@ mean different things, than with the associativity that > repeated calls to the same @@ should have. > > Personally I couldn't care less, and if I had a vote I would let @@ rest > for now, until we see how @ plays out. > > Jaime > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Sat Mar 15 07:44:49 2014 From: robert.kern at gmail.com (Robert Kern) Date: Sat, 15 Mar 2014 11:44:49 +0000 Subject: [Numpy-discussion] [help needed] associativity and precedence of '@' In-Reply-To: References: Message-ID: I tend to favor tight-right. The general scheme of precedence more or less puts "heavier" operations higher than "lighter" operations (+ < * < **) and @ is "heavier" than * in my mind. I think tight (either -right or -left) has a good correspondence with current dot() expressions, so it will make translation a bit more straightforward, IMO. s * dot(A, b) == s * A @ b dot(s * A, b) == (s * A) @ b On Sat, Mar 15, 2014 at 3:41 AM, Nathaniel Smith wrote: > Hi all, > > Here's the main blocker for adding a matrix multiply operator '@' to Python: > we need to decide what we think its precedence and associativity should be. > I'll explain what that means so we're on the same page, and what the choices > are, and then we can all argue about it. But even better would be if we > could get some data to guide our decision, and this would be a lot easier if > some of you all can help; I'll suggest some ways you might be able to do > that. > > So! Precedence and left- versus right-associativity. If you already know > what these are you can skim down until you see CAPITAL LETTERS. > > We all know what precedence is. Code like this: > a + b * c > gets evaluated as: > a + (b * c) > because * has higher precedence than +. It "binds more tightly", as they > say. Python's complete precedence able is here: > http://docs.python.org/3/reference/expressions.html#operator-precedence > > Associativity, in the parsing sense, is less well known, though it's just as > important. It's about deciding how to evaluate code like this: > a * b * c > Do we use > a * (b * c) # * is "right associative" > or > (a * b) * c # * is "left associative" > ? Here all the operators have the same precedence (because, uh... they're > the same operator), so precedence doesn't help. And mostly we can ignore > this in day-to-day life, because both versions give the same answer, so who > cares. But a programming language has to pick one (consider what happens if > one of those objects has a non-default __mul__ implementation). And of > course it matters a lot for non-associative operations like > a - b - c > or > a / b / c > So when figuring out order of evaluations, what you do first is check the > precedence, and then if you have multiple operators next to each other with > the same precedence, you check their associativity. Notice that this means > that if you have different operators that share the same precedence level > (like + and -, or * and /), then they have to all have the same > associativity. All else being equal, it's generally considered nice to have > fewer precedence levels, because these have to be memorized by users. > > Right now in Python, every precedence level is left-associative, except for > '**'. If you write these formulas without any parentheses, then what the > interpreter will actually execute is: > (a * b) * c > (a - b) - c > (a / b) / c > but > a ** (b ** c) > > Okay, that's the background. Here's the question. We need to decide on > precedence and associativity for '@'. In particular, there are three > different options that are interesting: > > OPTION 1 FOR @: > Precedence: same as * > Associativity: left > My shorthand name for it: "same-left" (yes, very creative) > > This means that if you don't use parentheses, you get: > a @ b @ c -> (a @ b) @ c > a * b @ c -> (a * b) @ c > a @ b * c -> (a @ b) * c > > OPTION 2 FOR @: > Precedence: more-weakly-binding than * > Associativity: right > My shorthand name for it: "weak-right" > > This means that if you don't use parentheses, you get: > a @ b @ c -> a @ (b @ c) > a * b @ c -> (a * b) @ c > a @ b * c -> a @ (b * c) > > OPTION 3 FOR @: > Precedence: more-tightly-binding than * > Associativity: right > My shorthand name for it: "tight-right" > > This means that if you don't use parentheses, you get: > a @ b @ c -> a @ (b @ c) > a * b @ c -> a * (b @ c) > a @ b * c -> (a @ b) * c > > We need to pick which of which options we think is best, based on whatever > reasons we can think of, ideally more than "hmm, weak-right gives me warm > fuzzy feelings" ;-). (In principle the other 2 possible options are > tight-left and weak-left, but there doesn't seem to be any argument in favor > of either, so we'll leave them out of the discussion.) > > Some things to consider: > > * and @ are actually not associative (in the math sense) with respect to > each other, i.e., (a * b) @ c and a * (b @ c) in general give different > results when 'a' is not a scalar. So considering the two expressions 'a * b > @ c' and 'a @ b * c', we can see that each of these three options gives > produces different results in some cases. > > "Same-left" is the easiest to explain and remember, because it's just, "@ > acts like * and /". So we already have to know the rule in order to > understand other non-associative expressions like a / b / c or a - b - c, > and it'd be nice if the same rule applied to things like a * b @ c so we > only had to memorize *one* rule. (Of course there's ** which uses the > opposite rule, but I guess everyone internalized that one in secondary > school; that's not true for * versus @.) This is definitely the default we > should choose unless we have a good reason to do otherwise. > > BUT: there might indeed be a good reason to do otherwise, which is the whole > reason this has come up. Consider: > Mat1 @ Mat2 @ vec > Obviously this will execute much more quickly if we do > Mat1 @ (Mat2 @ vec) > because that results in two cheap matrix-vector multiplies, while > (Mat1 @ Mat2) @ vec > starts out by doing an expensive matrix-matrix multiply. So: maybe @ should > be right associative, so that we get the fast behaviour without having to > use explicit parentheses! /If/ these kinds of expressions are common enough > that having to remember to put explicit parentheses in all the time is more > of a programmer burden than having to memorize a special associativity rule > for @. Obviously Mat @ Mat @ vec is more common than vec @ Mat @ Mat, but > maybe they're both so rare that it doesn't matter in practice -- I don't > know. > > Also, if we do want @ to be right associative, then I can't think of any > clever reasons to prefer weak-right over tight-right, or vice-versa. For the > scalar multiplication case, I believe both options produce the same result > in the same amount of time. For the non-scalar case, they give different > answers. Do people have strong intuitions about what expressions like > a * b @ c > a @ b * c > should do actually? (I'm guessing not, but hey, you never know.) > > And, while intuition is useful, it would be really *really* nice to be > basing these decisions on more than *just* intuition, since whatever we > decide will be subtly influencing the experience of writing linear algebra > code in Python for the rest of time. So here's where I could use some help. > First, of course, if you have any other reasons why one or the other of > these options is better, then please share! But second, I think we need to > know something about how often the Mat @ Mat @ vec type cases arise in > practice. How often do non-scalar * and np.dot show up in the same > expression? How often does it look like a * np.dot(b, c), and how often does > it look like np.dot(a * b, c)? How often do we see expressions like > np.dot(np.dot(a, b), c), and how often do we see expressions like np.dot(a, > np.dot(b, c))? This would really help guide the debate. I don't have this > data, and I'm not sure the best way to get it. A super-fancy approach would > be to write a little script that uses the 'ast' module to count things > automatically. A less fancy approach would be to just pick some code you've > written, or a well-known package, grep through for calls to 'dot', and make > notes on what you see. (An advantage of the less-fancy approach is that as a > human you might be able to tell the difference between scalar and non-scalar > *, or check whether it actually matters what order the 'dot' calls are done > in.) > > -n > > -- > Nathaniel J. Smith > Postdoctoral researcher - Informatics - University of Edinburgh > http://vorpus.org > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Robert Kern From tmp50 at ukr.net Sat Mar 15 08:44:48 2014 From: tmp50 at ukr.net (Dmitrey) Date: Sat, 15 Mar 2014 14:44:48 +0200 Subject: [Numpy-discussion] [ANN] OpenOpt Suite release 0.53: Stochastic programming addon now is BSD-licensed Message-ID: <1394887377.137996545.15dx925h@frv44.fwdcdn.com> hi all, I'm glad to inform you about new OpenOpt Suite release 0.53: ? ? Stochastic programming addon now is available for free ? ? Some minor changes -------------------------------------------------- Regards, D. http://openopt.org/Dmitrey -------------- next part -------------- An HTML attachment was scrubbed... URL: From alan.isaac at gmail.com Sat Mar 15 09:13:06 2014 From: alan.isaac at gmail.com (Alan G Isaac) Date: Sat, 15 Mar 2014 09:13:06 -0400 Subject: [Numpy-discussion] [RFC] should we argue for a matrix power operator, @@? In-Reply-To: References: Message-ID: <532451E2.5000205@gmail.com> On 3/15/2014 12:32 AM, Nathaniel Smith wrote: > I know you were worried > about losing the .I attribute on matrices if switching to ndarrays for > teaching -- given that ndarray will probably not get a .I attribute, > how much would the existence of @@ -1 affect you? Not much. Positive integer powers would be useful (for illustrating e.g. graph theory and difference equations), but not enough to delay the PEP. I think NumPy should "take the money and run". Getting `@` is great. Let's get experience with it before deciding whether it's worth asking for `@@`. Questions for `@@`: - would it just be `matrix_power`, with all the restrictions? - or would `a(10,2,2)@@-1` return an array of matrix inverses? - etc In the end, I'd like to see a functional implementation before deciding on `@@`, but I would not like to see `@` delayed at all. Congratulations, Alan From charlesr.harris at gmail.com Sat Mar 15 10:49:26 2014 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 15 Mar 2014 08:49:26 -0600 Subject: [Numpy-discussion] [help needed] associativity and precedence of '@' In-Reply-To: References: Message-ID: I favor the weak right option. 1) Giving '*' higher precedence than `@` makes it easier, to my mind, to parse out what is going to happen: all the element-wise multiplications, followed by the matrix operations. I'd probably still use parenthesis for clarity. 2) Right associative has the advantage of efficiency in many common use cases, plus I tend to read matrix expressions from right to left. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sat Mar 15 10:52:53 2014 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 15 Mar 2014 08:52:53 -0600 Subject: [Numpy-discussion] It looks like Py 3.5 will include a dedicated infix matrix multiply operator In-Reply-To: References: Message-ID: On Fri, Mar 14, 2014 at 6:51 PM, Nathaniel Smith wrote: > Well, that was fast. Guido says he'll accept the addition of '@' as an > infix operator for matrix multiplication, once some details are ironed > out: > https://mail.python.org/pipermail/python-ideas/2014-March/027109.html > http://legacy.python.org/dev/peps/pep-0465/ > > Specifically, we need to figure out whether we want to make an > argument for a matrix power operator ("@@"), and what > precedence/associativity we want '@' to have. I'll post two separate > threads to get feedback on those in an organized way -- this is just a > heads-up. > > Surprisingly little discussion on python-ideas, or so it seemed to me. Guido came out in favor less than halfway through. Congratulations on putting together a successful proposal, many of us had given up on ever seeing a matrix multiplication operator. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sat Mar 15 11:18:40 2014 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 15 Mar 2014 09:18:40 -0600 Subject: [Numpy-discussion] [RFC] should we argue for a matrix power operator, @@? In-Reply-To: References: Message-ID: On Fri, Mar 14, 2014 at 10:32 PM, Nathaniel Smith wrote: > Hi all, > > Here's the second thread for discussion about Guido's concerns about > PEP 465. The issue here is that PEP 465 as currently written proposes > two new operators, @ for matrix multiplication and @@ for matrix power > (analogous to * and **): > http://legacy.python.org/dev/peps/pep-0465/ > > The main thing we care about of course is @; I pushed for including @@ > because I thought it was nicer to have than not, and I thought the > analogy between * and ** might make the overall package more appealing > to Guido's aesthetic sense. > > It turns out I was wrong :-). Guido is -0 on @@, but willing to be > swayed if we think it's worth the trouble to make a solid case. > > Note that question now is *not*, how will @@ affect the reception of > @. @ itself is AFAICT a done deal, regardless of what happens with @@. > For this discussion let's assume @ can be taken for granted, and that > we can freely choose to either add @@ or not add @@ to the language. > The question is: which do we think makes Python a better language (for > us and in general)? > > Some thoughts to start us off: > > Here are the interesting use cases for @@ that I can think of: > - 'vector @@ 2' gives the squared Euclidean length (because it's the > same as vector @ vector). Kind of handy. > - 'matrix @@ n' of course gives the matrix power, which is of marginal > use but does come in handy sometimes, e.g., when looking at graph > connectivity. > - 'matrix @@ -1' provides a very transparent notation for translating > textbook formulas (with all their inverses) into code. It's a bit > unhelpful in practice, because (a) usually you should use solve(), and > (b) 'matrix @@ -1' is actually more characters than 'inv(matrix)'. But > sometimes transparent notation may be important. (And in some cases, > like using numba or theano or whatever, 'matrix @@ -1 @ foo' could be > compiled into a call to solve() anyway.) > > (Did I miss any?) > > In practice it seems to me that the last use case is the one that's > might matter a lot practice, but then again, it might not -- I'm not > sure. For example, does anyone who teaches programming with numpy have > a feeling about whether the existence of '@@ -1' would make a big > difference to you and your students? (Alan? I know you were worried > about losing the .I attribute on matrices if switching to ndarrays for > teaching -- given that ndarray will probably not get a .I attribute, > how much would the existence of @@ -1 affect you?) > > On a more technical level, Guido is worried about how @@'s precedence > should work (and this is somewhat related to the other thread about > @'s precedence and associativity, because he feels that if we end up > giving @ and * different precedence, then that makes it much less > clear what to do with @@, and reduces the strength of the */**/@/@@ > analogy). In particular, if we want to argue for @@ then we'll need to > figure out what expressions like > a @@ b @@ c > and > a ** b @@ c > and > a @@ b ** c > should do. > > A related question is what @@ should do if given an array as its right > argument. In the current PEP, only integers are accepted, which rules > out a bunch of the more complicated cases like a @@ b @@ c (at least > assuming @@ is right-associative, like **, and I can't see why you'd > want anything else). OTOH, in the brave new gufunc world, it > technically would make sense to define @@ as being a gufunc with > signature (m,m),()->(m,m), and the way gufuncs work this *would* allow > the "power" to be an array -- for example, we'd have: > > mat = randn(m, m) > pow = range(n) > result = gufunc_matrix_power(mat, pow) > assert result.shape == (n, m, m) > for i in xrange(n): > assert np.all(result[i, :, :] == mat ** i) > > In this case, a @@ b @@ c would at least be a meaningful expression to > write. OTOH it would be incredibly bizarre and useless, so probably > no-one would ever write it. > > As far as these technical issues go, my guess is that the correct rule > is that @@ should just have the same precedence and the same (right) > associativity as **, and in practice no-one will ever write stuff like > a @@ b @@ c. But if we want to argue for @@ we need to come to some > consensus or another here. > > It's also possible the answer is "ugh, these issues are too > complicated, we should defer this until later when we have more > experience with @ and gufuncs and stuff". After all, I doubt anyone > else will swoop in and steal @@ to mean something else! OTOH, if e.g. > there's a strong feeling that '@@ -1' will make a big difference in > pedagogical contexts, then putting that off for years might be a > mistake. > > I don't have a strong feeling either way on '@@' . Matrix inverses are pretty common in matrix expressions, but I don't know that the new operator offers much advantage over a function call. The positive integer powers might be useful in some domains, as others have pointed out, but computational practice one would tend to factor the evaluation. Chuck -n > > -- > Nathaniel J. Smith > Postdoctoral researcher - Informatics - University of Edinburgh > http://vorpus.org > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Sat Mar 15 11:58:32 2014 From: robert.kern at gmail.com (Robert Kern) Date: Sat, 15 Mar 2014 15:58:32 +0000 Subject: [Numpy-discussion] [help needed] associativity and precedence of '@' In-Reply-To: References: Message-ID: On Sat, Mar 15, 2014 at 2:49 PM, Charles R Harris wrote: > > I favor the weak right option. > > 1) Giving '*' higher precedence than `@` makes it easier, to my mind, to > parse out what is going to happen: all the element-wise multiplications, > followed by the matrix operations. I'd probably still use parenthesis for > clarity. It seems to me that 'tight' gives the same benefit. Any reasoning for the preference of 'weak' over 'tight'? -- Robert Kern From shish at keba.be Sat Mar 15 12:03:42 2014 From: shish at keba.be (Olivier Delalleau) Date: Sat, 15 Mar 2014 12:03:42 -0400 Subject: [Numpy-discussion] [RFC] should we argue for a matrix power operator, @@? In-Reply-To: References: Message-ID: 2014-03-15 11:18 GMT-04:00 Charles R Harris : > > > > On Fri, Mar 14, 2014 at 10:32 PM, Nathaniel Smith wrote: > >> Hi all, >> >> Here's the second thread for discussion about Guido's concerns about >> PEP 465. The issue here is that PEP 465 as currently written proposes >> two new operators, @ for matrix multiplication and @@ for matrix power >> (analogous to * and **): >> http://legacy.python.org/dev/peps/pep-0465/ >> >> The main thing we care about of course is @; I pushed for including @@ >> because I thought it was nicer to have than not, and I thought the >> analogy between * and ** might make the overall package more appealing >> to Guido's aesthetic sense. >> >> It turns out I was wrong :-). Guido is -0 on @@, but willing to be >> swayed if we think it's worth the trouble to make a solid case. >> >> Note that question now is *not*, how will @@ affect the reception of >> @. @ itself is AFAICT a done deal, regardless of what happens with @@. >> For this discussion let's assume @ can be taken for granted, and that >> we can freely choose to either add @@ or not add @@ to the language. >> The question is: which do we think makes Python a better language (for >> us and in general)? >> >> Some thoughts to start us off: >> >> Here are the interesting use cases for @@ that I can think of: >> - 'vector @@ 2' gives the squared Euclidean length (because it's the >> same as vector @ vector). Kind of handy. >> - 'matrix @@ n' of course gives the matrix power, which is of marginal >> use but does come in handy sometimes, e.g., when looking at graph >> connectivity. >> - 'matrix @@ -1' provides a very transparent notation for translating >> textbook formulas (with all their inverses) into code. It's a bit >> unhelpful in practice, because (a) usually you should use solve(), and >> (b) 'matrix @@ -1' is actually more characters than 'inv(matrix)'. But >> sometimes transparent notation may be important. (And in some cases, >> like using numba or theano or whatever, 'matrix @@ -1 @ foo' could be >> compiled into a call to solve() anyway.) >> >> (Did I miss any?) >> >> In practice it seems to me that the last use case is the one that's >> might matter a lot practice, but then again, it might not -- I'm not >> sure. For example, does anyone who teaches programming with numpy have >> a feeling about whether the existence of '@@ -1' would make a big >> difference to you and your students? (Alan? I know you were worried >> about losing the .I attribute on matrices if switching to ndarrays for >> teaching -- given that ndarray will probably not get a .I attribute, >> how much would the existence of @@ -1 affect you?) >> >> On a more technical level, Guido is worried about how @@'s precedence >> should work (and this is somewhat related to the other thread about >> @'s precedence and associativity, because he feels that if we end up >> giving @ and * different precedence, then that makes it much less >> clear what to do with @@, and reduces the strength of the */**/@/@@ >> analogy). In particular, if we want to argue for @@ then we'll need to >> figure out what expressions like >> a @@ b @@ c >> and >> a ** b @@ c >> and >> a @@ b ** c >> should do. >> >> A related question is what @@ should do if given an array as its right >> argument. In the current PEP, only integers are accepted, which rules >> out a bunch of the more complicated cases like a @@ b @@ c (at least >> assuming @@ is right-associative, like **, and I can't see why you'd >> want anything else). OTOH, in the brave new gufunc world, it >> technically would make sense to define @@ as being a gufunc with >> signature (m,m),()->(m,m), and the way gufuncs work this *would* allow >> the "power" to be an array -- for example, we'd have: >> >> mat = randn(m, m) >> pow = range(n) >> result = gufunc_matrix_power(mat, pow) >> assert result.shape == (n, m, m) >> for i in xrange(n): >> assert np.all(result[i, :, :] == mat ** i) >> >> In this case, a @@ b @@ c would at least be a meaningful expression to >> write. OTOH it would be incredibly bizarre and useless, so probably >> no-one would ever write it. >> >> As far as these technical issues go, my guess is that the correct rule >> is that @@ should just have the same precedence and the same (right) >> associativity as **, and in practice no-one will ever write stuff like >> a @@ b @@ c. But if we want to argue for @@ we need to come to some >> consensus or another here. >> >> It's also possible the answer is "ugh, these issues are too >> complicated, we should defer this until later when we have more >> experience with @ and gufuncs and stuff". After all, I doubt anyone >> else will swoop in and steal @@ to mean something else! OTOH, if e.g. >> there's a strong feeling that '@@ -1' will make a big difference in >> pedagogical contexts, then putting that off for years might be a >> mistake. >> >> > I don't have a strong feeling either way on '@@' . Matrix inverses are > pretty common in matrix expressions, but I don't know that the new operator > offers much advantage over a function call. The positive integer powers > might be useful in some domains, as others have pointed out, but > computational practice one would tend to factor the evaluation. > > Chuck > Personally I think it should go in, because: - it's useful (although marginally), as in the examples previously mentioned - it's what people will expect - it's the only reasonable use of @@ once @ makes it in As far as the details about precedence rules and what not... Yes, someone should think about them and come up with rules that make sense, but since it will be pretty much only be used in unambiguous situations, this shouldn't be a blocker. -=- Olivier -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Sat Mar 15 12:13:44 2014 From: robert.kern at gmail.com (Robert Kern) Date: Sat, 15 Mar 2014 16:13:44 +0000 Subject: [Numpy-discussion] [help needed] associativity and precedence of '@' In-Reply-To: References: Message-ID: On Sat, Mar 15, 2014 at 11:44 AM, Robert Kern wrote: > I tend to favor tight-right. The general scheme of precedence more or > less puts "heavier" operations higher than "lighter" operations (+ < * > < **) and @ is "heavier" than * in my mind. I think tight (either > -right or -left) has a good correspondence with current dot() > expressions, so it will make translation a bit more straightforward, > IMO. > > s * dot(A, b) == s * A @ b > dot(s * A, b) == (s * A) @ b I'm not sure if this is a convincing argument, but I'll throw it out there: in most of my programming fonts, @ is a bigger, more visually salient character than *, so in any expression that mixes the two, my visual system is going to end up grouping the @-connected terms anyways. -- Robert Kern From lmao20001 at gmail.com Sat Mar 15 12:27:42 2014 From: lmao20001 at gmail.com (Leo Mao) Date: Sun, 16 Mar 2014 00:27:42 +0800 Subject: [Numpy-discussion] GSoC project: draft of proposal In-Reply-To: References: <97A1D9A3-BBB5-4D2B-89F5-5C5318747591@gmail.com> <92158750-F4CC-460B-9953-AFEDA867A6FE@gmail.com> <53235BAB.5090500@googlemail.com> Message-ID: Because of the license problem, I think I will choose Yeppp as a default backend. And if time allows, maybe I can implement other bindings. (Vc library) Also I found that sleef library is in public domain. But it seems that it only provides fast math function, not "vectorized math function". So I am not sure if it can be used in this project. Finally, if there are any suggestions for my proposal, please point out. I will appreciate your suggestions. Thanks. Regards, Leo Mao -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sat Mar 15 12:40:00 2014 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 15 Mar 2014 10:40:00 -0600 Subject: [Numpy-discussion] [help needed] associativity and precedence of '@' In-Reply-To: References: Message-ID: On Sat, Mar 15, 2014 at 9:58 AM, Robert Kern wrote: > On Sat, Mar 15, 2014 at 2:49 PM, Charles R Harris > wrote: > > > > I favor the weak right option. > > > > 1) Giving '*' higher precedence than `@` makes it easier, to my mind, to > > parse out what is going to happen: all the element-wise multiplications, > > followed by the matrix operations. I'd probably still use parenthesis for > > clarity. > > It seems to me that 'tight' gives the same benefit. Any reasoning for > the preference of 'weak' over 'tight'? > > Two other reasons come to mind. First, '*' is right associative, so I think it is nicer to first view the expression as parsed into blocks separated by '@', which act somewhat like parenthesis at that point, and then evaluate the blocks. Second, and somewhat weaker, it might make it easier to track how arrays are broadcast. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sat Mar 15 12:48:08 2014 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 15 Mar 2014 10:48:08 -0600 Subject: [Numpy-discussion] [help needed] associativity and precedence of '@' In-Reply-To: References: Message-ID: Oops, make that '*' is *left* associative. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Sat Mar 15 12:49:55 2014 From: robert.kern at gmail.com (Robert Kern) Date: Sat, 15 Mar 2014 16:49:55 +0000 Subject: [Numpy-discussion] [help needed] associativity and precedence of '@' In-Reply-To: References: Message-ID: On Sat, Mar 15, 2014 at 4:40 PM, Charles R Harris wrote: > > On Sat, Mar 15, 2014 at 9:58 AM, Robert Kern wrote: >> >> On Sat, Mar 15, 2014 at 2:49 PM, Charles R Harris >> wrote: >> > >> > I favor the weak right option. >> > >> > 1) Giving '*' higher precedence than `@` makes it easier, to my mind, to >> > parse out what is going to happen: all the element-wise multiplications, >> > followed by the matrix operations. I'd probably still use parenthesis >> > for >> > clarity. >> >> It seems to me that 'tight' gives the same benefit. Any reasoning for >> the preference of 'weak' over 'tight'? >> > > Two other reasons come to mind. First, '*' is right associative, so I think > it is nicer to first view the expression as parsed into blocks separated by > '@', which act somewhat like parenthesis at that point, and then evaluate > the blocks. Again, I think tight does the same amount of separation, just with blocks of matrix multiplication broken up by elementwise multiplication; this is just an argument against 'same', not for 'weak' over 'tight'. As I mentioned elsewhere, my visual system seems to break things up with @-tight anyways. Does it not for you? > Second, and somewhat weaker, it might make it easier to track > how arrays are broadcast. I think this point is going to ultimately determine this, if we can break down all the cases and if they actually do favor one choice over another. -- Robert Kern From ndarray at mac.com Sat Mar 15 14:25:03 2014 From: ndarray at mac.com (Alexander Belopolsky) Date: Sat, 15 Mar 2014 14:25:03 -0400 Subject: [Numpy-discussion] [help needed] associativity and precedence of '@' In-Reply-To: References: Message-ID: On Fri, Mar 14, 2014 at 11:41 PM, Nathaniel Smith wrote: > Here's the main blocker for adding a matrix multiply operator '@' to > Python: we need to decide what we think its precedence and associativity > should be. I am not ready to form my own opinion, but I hope the following will help shaping the discussion. Currently, [1], Python operator precedence is +, -Addition and subtraction*, /, //, %Multiplication, division, remainder [5] +x, -x, ~xPositive, negative, bitwise NOT**Exponentiation [6] x[index], x[index:index], x(arguments...), x.attributeSubscription, slicing, call, attribute reference We need to decide whether @ belongs to one of the existing row or deserves one of its own. The associativity debate is one of those debates [2] where there is no right answer. Guido has very wisely left it for the numeric community to decide. I would start with surveying the prior art of using right associativity and the reasons it was chosen and see if those reasons apply. (An example of a choice made for wrong reasons is our decimal system. We write our numbers backwards - from high to low place value - only because we took them from people who write text from right to left. As a result, computer parsers have to skip to the last or count the number of digits before they can start evaluating the number.) Here is the start: 1. APL uses right to left associativity for all operators and all operators have the same precedence. 2. Exponentiation operator is right associative in most languages with MATLAB being a notable exception. [1] http://docs.python.org/3/reference/expressions.html#evaluation-order [2] http://en.wikipedia.org/wiki/Lilliput_and_Blefuscu [3] http://www.tcl.tk/cgi-bin/tct/tip/274.html -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Sat Mar 15 14:28:51 2014 From: njs at pobox.com (Nathaniel Smith) Date: Sat, 15 Mar 2014 18:28:51 +0000 Subject: [Numpy-discussion] [help needed] associativity and precedence of '@' In-Reply-To: References: Message-ID: On Sat, Mar 15, 2014 at 3:41 AM, Nathaniel Smith wrote: > Hi all, > > Here's the main blocker for adding a matrix multiply operator '@' to Python: > we need to decide what we think its precedence and associativity should be. Another data point that might be useful: Matlab: same-left R: tight-left IDL: same-left GAUSS: same-left (IIUC -- any GAUSS experts please correct me if I misunderstood the fine manual) Mathematica: instead of having an associativity, a @ b @ c gets converted into mdot([a, b, c]) -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org From joferkington at gmail.com Sat Mar 15 14:33:44 2014 From: joferkington at gmail.com (Joe Kington) Date: Sat, 15 Mar 2014 13:33:44 -0500 Subject: [Numpy-discussion] [help needed] associativity and precedence of '@' In-Reply-To: References: Message-ID: On Sat, Mar 15, 2014 at 1:28 PM, Nathaniel Smith wrote: > On Sat, Mar 15, 2014 at 3:41 AM, Nathaniel Smith wrote: > > Hi all, > > > > Here's the main blocker for adding a matrix multiply operator '@' to > Python: > > we need to decide what we think its precedence and associativity should > be. > > Another data point that might be useful: > > Matlab: same-left > > R: tight-left > I was going to ask this earlier, but I was worried I was missing something major. Why was "tight-left" not an option? This means that if you don't use parentheses, you get: a @ b @ c -> (a @ b) @ c a * b @ c -> a * (b @ c) a @ b * c -> (a @ b) * c In my (very inexperienced) opinion, it seems like the most intuitive option. Cheers, -Joe > IDL: same-left > > GAUSS: same-left (IIUC -- any GAUSS experts please correct me if I > misunderstood the fine manual) > > Mathematica: instead of having an associativity, a @ b @ c gets > converted into mdot([a, b, c]) > > -- > Nathaniel J. Smith > Postdoctoral researcher - Informatics - University of Edinburgh > http://vorpus.org > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Sat Mar 15 14:34:22 2014 From: njs at pobox.com (Nathaniel Smith) Date: Sat, 15 Mar 2014 18:34:22 +0000 Subject: [Numpy-discussion] [help needed] associativity and precedence of '@' In-Reply-To: References: Message-ID: Hi Chris, On Sat, Mar 15, 2014 at 4:15 AM, Chris Laumann wrote: > Hi all, > > Let me preface my two cents by saying that I think the best part of @ being > accepted is the potential for deprecating the matrix class ? the syntactic > beauty of infix for matrix multiply is a nice side effect IMHO :) This may > be why my basic attitude is: > > I don?t think it matters very much but I would vote (weakly) for weak-right. > Where there is ambiguity, I suspect most practitioners will just put in > parentheses anyway ? especially with combinations of * and @, where I don?t > think there is a natural intuitive precedence relationship. At least, > element-wise multiplication is very rare in math/physics texts as an > explicitly defined elementary operation so I?d be surprised if anybody had a > strong intuition about the precedence of the ?*? operator. And the binding > order doesn?t matter if it is scalar multiplication. "It doesn't matter" and "no-one has strong intuitions" are generally arguments for same-left, since that allows everyone to reason about @ in the same way they reason about all of Python's operators. > I have quite a bit of code with large matrices where the order of > matrix-vector multiplies is an important optimization and I would certainly > have a few simpler looking expressions for op @ op @ vec, hence the weak > preference for right-associativity. That said, I routinely come across > situations where the optimal matrix multiplication order is more complicated > than can be expressed as left-right or right-left (because some matrices > might be diagonal, CSR or CSC), which is why the preference is only weak. I > don?t see a down-side in the use-case that it is actually associative (as in > matrix-matrix-vector). Would you mind taking a more systematic look through this code, or sharing some examples so the rest of us can look? "Certainly have a few simpler looking expressions" is a good start, but when we're talking about changing the grammar of one of the most popular programming languages in the world it seems worth the effort to gather some more careful data :-). -n -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org From njs at pobox.com Sat Mar 15 14:40:40 2014 From: njs at pobox.com (Nathaniel Smith) Date: Sat, 15 Mar 2014 18:40:40 +0000 Subject: [Numpy-discussion] [help needed] associativity and precedence of '@' In-Reply-To: References: Message-ID: On Sat, Mar 15, 2014 at 6:33 PM, Joe Kington wrote: > On Sat, Mar 15, 2014 at 1:28 PM, Nathaniel Smith wrote: >> >> On Sat, Mar 15, 2014 at 3:41 AM, Nathaniel Smith wrote: >> > Hi all, >> > >> > Here's the main blocker for adding a matrix multiply operator '@' to >> > Python: >> > we need to decide what we think its precedence and associativity should >> > be. >> >> Another data point that might be useful: >> >> Matlab: same-left >> >> >> R: tight-left > > > > I was going to ask this earlier, but I was worried I was missing something > major. > > Why was "tight-left" not an option? > > > This means that if you don't use parentheses, you get: > a @ b @ c -> (a @ b) @ c > a * b @ c -> a * (b @ c) > a @ b * c -> (a @ b) * c > > > In my (very inexperienced) opinion, it seems like the most intuitive option. Because tight-left doesn't seem to have much to recommend it over same-left, and all else being equal having fewer levels of precedence is usually considered a good thing. Unless I'm missing something. If we do decide that tight-left is best then we could certainly advocate for it. I wouldn't read too much into R's choice; they don't actually define a separate precedence level for matrix multiplication specifically. They have a single precedence level for all "special" (user-defined) operators, and matrix multiplication happens to be one of these. (Their versions of // and % are also "special", but I don't think anyone would expect // to bind more tightly than / if one were choosing precedences on a case-by-case basis.) -n -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org From ndarray at mac.com Sat Mar 15 15:01:48 2014 From: ndarray at mac.com (Alexander Belopolsky) Date: Sat, 15 Mar 2014 15:01:48 -0400 Subject: [Numpy-discussion] [help needed] associativity and precedence of '@' In-Reply-To: References: Message-ID: On Sat, Mar 15, 2014 at 2:25 PM, Alexander Belopolsky wrote: > On Fri, Mar 14, 2014 at 11:41 PM, Nathaniel Smith wrote: > >> Here's the main blocker for adding a matrix multiply operator '@' to >> Python: we need to decide what we think its precedence and associativity >> should be. > > > I am not ready to form my own opinion, but I hope the following will help > shaping the discussion. One more question that I think should be answered by the PEP and may influence the associativity decision is what happens if in an A @ B @ C expression, each operand has its own type that defines __matmul__ and __rmatmul__? For example, A can be an ndarray, B a sympy expression and C a pyoperator. -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sat Mar 15 15:02:26 2014 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 15 Mar 2014 13:02:26 -0600 Subject: [Numpy-discussion] [help needed] associativity and precedence of '@' In-Reply-To: References: Message-ID: On Sat, Mar 15, 2014 at 12:40 PM, Nathaniel Smith wrote: > On Sat, Mar 15, 2014 at 6:33 PM, Joe Kington > wrote: > > On Sat, Mar 15, 2014 at 1:28 PM, Nathaniel Smith wrote: > >> > >> On Sat, Mar 15, 2014 at 3:41 AM, Nathaniel Smith wrote: > >> > Hi all, > >> > > >> > Here's the main blocker for adding a matrix multiply operator '@' to > >> > Python: > >> > we need to decide what we think its precedence and associativity > should > >> > be. > >> > >> Another data point that might be useful: > >> > >> Matlab: same-left > >> > >> > >> R: tight-left > > > > > > > > I was going to ask this earlier, but I was worried I was missing > something > > major. > > > > Why was "tight-left" not an option? > > > > > > This means that if you don't use parentheses, you get: > > a @ b @ c -> (a @ b) @ c > > a * b @ c -> a * (b @ c) > > a @ b * c -> (a @ b) * c > > > > > > In my (very inexperienced) opinion, it seems like the most intuitive > option. > > Because tight-left doesn't seem to have much to recommend it over > same-left, and all else being equal having fewer levels of precedence > is usually considered a good thing. Unless I'm missing something. If > we do decide that tight-left is best then we could certainly advocate > for it. > > I wouldn't read too much into R's choice; they don't actually define a > separate precedence level for matrix multiplication specifically. They > have a single precedence level for all "special" (user-defined) > operators, and matrix multiplication happens to be one of these. > (Their versions of // and % are also "special", but I don't think > anyone would expect // to bind more tightly than / if one were > choosing precedences on a case-by-case basis.) > > Just to throw something new into the mix u at v@w = u@(v at w) -- u at v is a dyadic matrix u at v -- is a scalar It would be nice if u at v@None, or some such, would evaluate as a dyad. Or else we will still need the concept of row and column 1-D matrices. I still think v.T should set a flag so that one can distinguish u at v.T (dyad) from u.T at v (inner product), where 1-D arrays are normally treated as column vectors. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sat Mar 15 15:16:38 2014 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 15 Mar 2014 13:16:38 -0600 Subject: [Numpy-discussion] [help needed] associativity and precedence of '@' In-Reply-To: References: Message-ID: On Sat, Mar 15, 2014 at 1:01 PM, Alexander Belopolsky wrote: > > On Sat, Mar 15, 2014 at 2:25 PM, Alexander Belopolsky wrote: > >> On Fri, Mar 14, 2014 at 11:41 PM, Nathaniel Smith wrote: >> >>> Here's the main blocker for adding a matrix multiply operator '@' to >>> Python: we need to decide what we think its precedence and associativity >>> should be. >> >> >> I am not ready to form my own opinion, but I hope the following will help >> shaping the discussion. > > > One more question that I think should be answered by the PEP and may > influence the associativity decision is what happens if in an A @ B @ C > expression, each operand has its own type that defines __matmul__ and > __rmatmul__? For example, A can be an ndarray, B a sympy expression and C > a pyoperator. > My impression is that the pyoperator folks would prefer right associativity as it corresponds to function composition, which also proceeds right to left. I don't think the sympy folks have expressed and opinion, except perhaps that they are more in the sage camp where matrices are symbolic, and not to be confused with arrays. That is, they don't depend on having two operators, one for the Hadamard product and another for the matrix product. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Sat Mar 15 15:29:50 2014 From: njs at pobox.com (Nathaniel Smith) Date: Sat, 15 Mar 2014 19:29:50 +0000 Subject: [Numpy-discussion] [help needed] associativity and precedence of '@' In-Reply-To: References: Message-ID: On 15 Mar 2014 19:02, "Charles R Harris" wrote: > Just to throw something new into the mix > > u at v@w = u@(v at w) -- u at v is a dyadic matrix > > u at v -- is a scalar > > It would be nice if u at v@None, or some such, would evaluate as a dyad. Or else we will still need the concept of row and column 1-D matrices. I still think v.T should set a flag so that one can distinguish u at v.T (dyad) from u.T at v (inner product), where 1-D arrays are normally treated as column vectors. This sounds important but I have no idea what any of it means :-) (What's a dyadic matrix?) Can you elaborate? -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From ndarray at mac.com Sat Mar 15 15:58:51 2014 From: ndarray at mac.com (Alexander Belopolsky) Date: Sat, 15 Mar 2014 15:58:51 -0400 Subject: [Numpy-discussion] [help needed] associativity and precedence of '@' In-Reply-To: References: Message-ID: On Sat, Mar 15, 2014 at 3:29 PM, Nathaniel Smith wrote: > > It would be nice if u at v@None, or some such, would evaluate as a dyad. > Or else we will still need the concept of row and column 1-D matrices. I > still think v.T should set a flag so that one can distinguish u at v.T(dyad) from u.T at v(inner product), where 1-D arrays are normally treated as column vectors. > > This sounds important but I have no idea what any of it means :-) (What's > a dyadic matrix?) Can you elaborate? > I assume dyadic means 2d. This discussion gave me an idea that is only tangentially relevant to the discussion at hand. It looks like numpy operators commonly need to make a choice whether to treat an Nd array as a unit (atom) or as a list to broadcast itself over. APL-derived languages solve this problem by using operator modifiers. Applied to our case, given a dot-product operator @, each[@] operator works on 2d arrays by "dotting" them pair-wise and returning a 1d array. Similarly, eachleft[@] would operate on 2d, 1d operands by broadcasting itself over the left operand (incidentally reproducing the mat @ vec behavior) and eachright[@] would treat its left operand atomically and broadcast over the right operand. My idea is inspired by Guido's "use facade" suggestion. We can define ndarray.each(axes=(0,)) method that would return a light-weigh proxy object so that a each[@] b is spelled a.each() @ b.each() a eachleft[@] b is spelled a.each() @ b a eachright[@] b is spelled a @ b.each() -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sat Mar 15 16:00:14 2014 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 15 Mar 2014 14:00:14 -0600 Subject: [Numpy-discussion] [help needed] associativity and precedence of '@' In-Reply-To: References: Message-ID: On Sat, Mar 15, 2014 at 1:29 PM, Nathaniel Smith wrote: > On 15 Mar 2014 19:02, "Charles R Harris" > wrote: > > Just to throw something new into the mix > > > > u at v@w = u@(v at w) -- u at v is a dyadic matrix > > > > u at v -- is a scalar > > > > It would be nice if u at v@None, or some such, would evaluate as a dyad. > Or else we will still need the concept of row and column 1-D matrices. I > still think v.T should set a flag so that one can distinguish u at v.T(dyad) from u.T at v(inner product), where 1-D arrays are normally treated as column vectors. > > This sounds important but I have no idea what any of it means :-) (What's > a dyadic matrix?) Can you elaborate? > Dyadic matrices date back to the beginning of vector calculus and J. W. Gibbs. These days they are usually written as v*w.T, i.e., the outer product of two vectors and are a fairly common occurrence in matrix expressions. For instance, covariance matrices are defined as E(v * v.T) Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From ndarray at mac.com Sat Mar 15 16:12:18 2014 From: ndarray at mac.com (Alexander Belopolsky) Date: Sat, 15 Mar 2014 16:12:18 -0400 Subject: [Numpy-discussion] [help needed] associativity and precedence of '@' In-Reply-To: References: Message-ID: On Sat, Mar 15, 2014 at 4:00 PM, Charles R Harris wrote: > These days they are usually written as v*w.T, i.e., the outer product of > two vectors and are a fairly common occurrence in matrix expressions. For > instance, covariance matrices are defined as E(v * v.T) With the current numpy, we can do >>> x = arange(1, 5) >>> x[:,None].dot(x[None,:]) array([[ 1, 2, 3, 4], [ 2, 4, 6, 8], [ 3, 6, 9, 12], [ 4, 8, 12, 16]]) I assume once @ becomes available, we will have >>> x[:,None] @ x[None,:] array([[ 1, 2, 3, 4], [ 2, 4, 6, 8], [ 3, 6, 9, 12], [ 4, 8, 12, 16]]) -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sat Mar 15 16:52:57 2014 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 15 Mar 2014 14:52:57 -0600 Subject: [Numpy-discussion] [help needed] associativity and precedence of '@' In-Reply-To: References: Message-ID: On Sat, Mar 15, 2014 at 2:12 PM, Alexander Belopolsky wrote: > > On Sat, Mar 15, 2014 at 4:00 PM, Charles R Harris < > charlesr.harris at gmail.com> wrote: > >> These days they are usually written as v*w.T, i.e., the outer product of >> two vectors and are a fairly common occurrence in matrix expressions. For >> instance, covariance matrices are defined as E(v * v.T) > > > With the current numpy, we can do > > >>> x = arange(1, 5) > >>> x[:,None].dot(x[None,:]) > array([[ 1, 2, 3, 4], > [ 2, 4, 6, 8], > [ 3, 6, 9, 12], > [ 4, 8, 12, 16]]) > > I assume once @ becomes available, we will have > > >>> x[:,None] @ x[None,:] > array([[ 1, 2, 3, 4], > [ 2, 4, 6, 8], > [ 3, 6, 9, 12], > [ 4, 8, 12, 16]]) > Yes, that works. I was thinking more of easy translation of the forms found in textbooks. Householder reflection, for instance, is usually written as I - 2 * v * v.T Where the `v` are unit vectors. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From shoyer at gmail.com Sat Mar 15 17:11:17 2014 From: shoyer at gmail.com (Stephan Hoyer) Date: Sat, 15 Mar 2014 14:11:17 -0700 Subject: [Numpy-discussion] [RFC] should we argue for a matrix power operator, @@? In-Reply-To: References: Message-ID: Speaking only for myself (and as someone who has regularly used matrix powers), I would not expect matrix power as @@ to follow from matrix multiplication as @. I do agree that matrix power is the only reasonable use for @@ (given @), but it's still not something I would be confident enough to know without looking up. We should keep in mind that each new operator imposes some (small) cognitive burden on everyone who encounters them for the first time, and, in this case, this will include a large fraction of all Python users, whether they do numerical computation or not. Guido has given us a tremendous gift in the form of @. Let's not insist on @@, when it is unclear if the burden of figuring out what @@ means it would be worth using, even for heavily numeric code. I would certainly prefer to encounter norm(A), inv(A), matrix_power(A, n), fractional_matrix_power(A, n) and expm(A) rather than their infix equivalents. It will certainly not be obvious which of these @@ will support for objects from any given library. One useful data point might be to consider whether matrix power is available as an infix operator in other languages commonly used for numerical work. AFAICT from some quick searches: MATLAB: Yes R: No IDL: No All of these languages do, of course, implement infix matrix multiplication, but it is apparently not clear at all whether the matrix power is useful. Best, Stephan On Sat, Mar 15, 2014 at 9:03 AM, Olivier Delalleau wrote: > 2014-03-15 11:18 GMT-04:00 Charles R Harris : > > >> >> >> On Fri, Mar 14, 2014 at 10:32 PM, Nathaniel Smith wrote: >> >>> Hi all, >>> >>> Here's the second thread for discussion about Guido's concerns about >>> PEP 465. The issue here is that PEP 465 as currently written proposes >>> two new operators, @ for matrix multiplication and @@ for matrix power >>> (analogous to * and **): >>> http://legacy.python.org/dev/peps/pep-0465/ >>> >>> The main thing we care about of course is @; I pushed for including @@ >>> because I thought it was nicer to have than not, and I thought the >>> analogy between * and ** might make the overall package more appealing >>> to Guido's aesthetic sense. >>> >>> It turns out I was wrong :-). Guido is -0 on @@, but willing to be >>> swayed if we think it's worth the trouble to make a solid case. >>> >>> Note that question now is *not*, how will @@ affect the reception of >>> @. @ itself is AFAICT a done deal, regardless of what happens with @@. >>> For this discussion let's assume @ can be taken for granted, and that >>> we can freely choose to either add @@ or not add @@ to the language. >>> The question is: which do we think makes Python a better language (for >>> us and in general)? >>> >>> Some thoughts to start us off: >>> >>> Here are the interesting use cases for @@ that I can think of: >>> - 'vector @@ 2' gives the squared Euclidean length (because it's the >>> same as vector @ vector). Kind of handy. >>> - 'matrix @@ n' of course gives the matrix power, which is of marginal >>> use but does come in handy sometimes, e.g., when looking at graph >>> connectivity. >>> - 'matrix @@ -1' provides a very transparent notation for translating >>> textbook formulas (with all their inverses) into code. It's a bit >>> unhelpful in practice, because (a) usually you should use solve(), and >>> (b) 'matrix @@ -1' is actually more characters than 'inv(matrix)'. But >>> sometimes transparent notation may be important. (And in some cases, >>> like using numba or theano or whatever, 'matrix @@ -1 @ foo' could be >>> compiled into a call to solve() anyway.) >>> >>> (Did I miss any?) >>> >>> In practice it seems to me that the last use case is the one that's >>> might matter a lot practice, but then again, it might not -- I'm not >>> sure. For example, does anyone who teaches programming with numpy have >>> a feeling about whether the existence of '@@ -1' would make a big >>> difference to you and your students? (Alan? I know you were worried >>> about losing the .I attribute on matrices if switching to ndarrays for >>> teaching -- given that ndarray will probably not get a .I attribute, >>> how much would the existence of @@ -1 affect you?) >>> >>> On a more technical level, Guido is worried about how @@'s precedence >>> should work (and this is somewhat related to the other thread about >>> @'s precedence and associativity, because he feels that if we end up >>> giving @ and * different precedence, then that makes it much less >>> clear what to do with @@, and reduces the strength of the */**/@/@@ >>> analogy). In particular, if we want to argue for @@ then we'll need to >>> figure out what expressions like >>> a @@ b @@ c >>> and >>> a ** b @@ c >>> and >>> a @@ b ** c >>> should do. >>> >>> A related question is what @@ should do if given an array as its right >>> argument. In the current PEP, only integers are accepted, which rules >>> out a bunch of the more complicated cases like a @@ b @@ c (at least >>> assuming @@ is right-associative, like **, and I can't see why you'd >>> want anything else). OTOH, in the brave new gufunc world, it >>> technically would make sense to define @@ as being a gufunc with >>> signature (m,m),()->(m,m), and the way gufuncs work this *would* allow >>> the "power" to be an array -- for example, we'd have: >>> >>> mat = randn(m, m) >>> pow = range(n) >>> result = gufunc_matrix_power(mat, pow) >>> assert result.shape == (n, m, m) >>> for i in xrange(n): >>> assert np.all(result[i, :, :] == mat ** i) >>> >>> In this case, a @@ b @@ c would at least be a meaningful expression to >>> write. OTOH it would be incredibly bizarre and useless, so probably >>> no-one would ever write it. >>> >>> As far as these technical issues go, my guess is that the correct rule >>> is that @@ should just have the same precedence and the same (right) >>> associativity as **, and in practice no-one will ever write stuff like >>> a @@ b @@ c. But if we want to argue for @@ we need to come to some >>> consensus or another here. >>> >>> It's also possible the answer is "ugh, these issues are too >>> complicated, we should defer this until later when we have more >>> experience with @ and gufuncs and stuff". After all, I doubt anyone >>> else will swoop in and steal @@ to mean something else! OTOH, if e.g. >>> there's a strong feeling that '@@ -1' will make a big difference in >>> pedagogical contexts, then putting that off for years might be a >>> mistake. >>> >>> >> I don't have a strong feeling either way on '@@' . Matrix inverses are >> pretty common in matrix expressions, but I don't know that the new operator >> offers much advantage over a function call. The positive integer powers >> might be useful in some domains, as others have pointed out, but >> computational practice one would tend to factor the evaluation. >> >> Chuck >> > > Personally I think it should go in, because: > - it's useful (although marginally), as in the examples previously > mentioned > - it's what people will expect > - it's the only reasonable use of @@ once @ makes it in > > As far as the details about precedence rules and what not... Yes, someone > should think about them and come up with rules that make sense, but since > it will be pretty much only be used in unambiguous situations, this > shouldn't be a blocker. > > -=- Olivier > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Sat Mar 15 20:38:14 2014 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 15 Mar 2014 20:38:14 -0400 Subject: [Numpy-discussion] [RFC] should we argue for a matrix power operator, @@? In-Reply-To: References: Message-ID: I think I wouldn't use anything like @@ often enough to remember it's meaning. I'd rather see english names for anything that is not **very** common. I find A@@-1 pretty ugly compared to inv(A) A@@(-0.5) might be nice (do we have matrix_sqrt ?) Josef On Sat, Mar 15, 2014 at 5:11 PM, Stephan Hoyer wrote: > Speaking only for myself (and as someone who has regularly used matrix > powers), I would not expect matrix power as @@ to follow from matrix > multiplication as @. I do agree that matrix power is the only reasonable > use for @@ (given @), but it's still not something I would be confident > enough to know without looking up. > > We should keep in mind that each new operator imposes some (small) > cognitive burden on everyone who encounters them for the first time, and, > in this case, this will include a large fraction of all Python users, > whether they do numerical computation or not. > > Guido has given us a tremendous gift in the form of @. Let's not insist on > @@, when it is unclear if the burden of figuring out what @@ means it would > be worth using, even for heavily numeric code. I would certainly prefer to > encounter norm(A), inv(A), matrix_power(A, n), fractional_matrix_power(A, > n) and expm(A) rather than their infix equivalents. It will certainly not > be obvious which of these @@ will support for objects from any given > library. > > One useful data point might be to consider whether matrix power is > available as an infix operator in other languages commonly used for > numerical work. AFAICT from some quick searches: > MATLAB: Yes > R: No > IDL: No > > All of these languages do, of course, implement infix matrix > multiplication, but it is apparently not clear at all whether the matrix > power is useful. > > Best, > Stephan > > > > > On Sat, Mar 15, 2014 at 9:03 AM, Olivier Delalleau wrote: > >> 2014-03-15 11:18 GMT-04:00 Charles R Harris : >> >> >>> >>> >>> On Fri, Mar 14, 2014 at 10:32 PM, Nathaniel Smith wrote: >>> >>>> Hi all, >>>> >>>> Here's the second thread for discussion about Guido's concerns about >>>> PEP 465. The issue here is that PEP 465 as currently written proposes >>>> two new operators, @ for matrix multiplication and @@ for matrix power >>>> (analogous to * and **): >>>> http://legacy.python.org/dev/peps/pep-0465/ >>>> >>>> The main thing we care about of course is @; I pushed for including @@ >>>> because I thought it was nicer to have than not, and I thought the >>>> analogy between * and ** might make the overall package more appealing >>>> to Guido's aesthetic sense. >>>> >>>> It turns out I was wrong :-). Guido is -0 on @@, but willing to be >>>> swayed if we think it's worth the trouble to make a solid case. >>>> >>>> Note that question now is *not*, how will @@ affect the reception of >>>> @. @ itself is AFAICT a done deal, regardless of what happens with @@. >>>> For this discussion let's assume @ can be taken for granted, and that >>>> we can freely choose to either add @@ or not add @@ to the language. >>>> The question is: which do we think makes Python a better language (for >>>> us and in general)? >>>> >>>> Some thoughts to start us off: >>>> >>>> Here are the interesting use cases for @@ that I can think of: >>>> - 'vector @@ 2' gives the squared Euclidean length (because it's the >>>> same as vector @ vector). Kind of handy. >>>> - 'matrix @@ n' of course gives the matrix power, which is of marginal >>>> use but does come in handy sometimes, e.g., when looking at graph >>>> connectivity. >>>> - 'matrix @@ -1' provides a very transparent notation for translating >>>> textbook formulas (with all their inverses) into code. It's a bit >>>> unhelpful in practice, because (a) usually you should use solve(), and >>>> (b) 'matrix @@ -1' is actually more characters than 'inv(matrix)'. But >>>> sometimes transparent notation may be important. (And in some cases, >>>> like using numba or theano or whatever, 'matrix @@ -1 @ foo' could be >>>> compiled into a call to solve() anyway.) >>>> >>>> (Did I miss any?) >>>> >>>> In practice it seems to me that the last use case is the one that's >>>> might matter a lot practice, but then again, it might not -- I'm not >>>> sure. For example, does anyone who teaches programming with numpy have >>>> a feeling about whether the existence of '@@ -1' would make a big >>>> difference to you and your students? (Alan? I know you were worried >>>> about losing the .I attribute on matrices if switching to ndarrays for >>>> teaching -- given that ndarray will probably not get a .I attribute, >>>> how much would the existence of @@ -1 affect you?) >>>> >>>> On a more technical level, Guido is worried about how @@'s precedence >>>> should work (and this is somewhat related to the other thread about >>>> @'s precedence and associativity, because he feels that if we end up >>>> giving @ and * different precedence, then that makes it much less >>>> clear what to do with @@, and reduces the strength of the */**/@/@@ >>>> analogy). In particular, if we want to argue for @@ then we'll need to >>>> figure out what expressions like >>>> a @@ b @@ c >>>> and >>>> a ** b @@ c >>>> and >>>> a @@ b ** c >>>> should do. >>>> >>>> A related question is what @@ should do if given an array as its right >>>> argument. In the current PEP, only integers are accepted, which rules >>>> out a bunch of the more complicated cases like a @@ b @@ c (at least >>>> assuming @@ is right-associative, like **, and I can't see why you'd >>>> want anything else). OTOH, in the brave new gufunc world, it >>>> technically would make sense to define @@ as being a gufunc with >>>> signature (m,m),()->(m,m), and the way gufuncs work this *would* allow >>>> the "power" to be an array -- for example, we'd have: >>>> >>>> mat = randn(m, m) >>>> pow = range(n) >>>> result = gufunc_matrix_power(mat, pow) >>>> assert result.shape == (n, m, m) >>>> for i in xrange(n): >>>> assert np.all(result[i, :, :] == mat ** i) >>>> >>>> In this case, a @@ b @@ c would at least be a meaningful expression to >>>> write. OTOH it would be incredibly bizarre and useless, so probably >>>> no-one would ever write it. >>>> >>>> As far as these technical issues go, my guess is that the correct rule >>>> is that @@ should just have the same precedence and the same (right) >>>> associativity as **, and in practice no-one will ever write stuff like >>>> a @@ b @@ c. But if we want to argue for @@ we need to come to some >>>> consensus or another here. >>>> >>>> It's also possible the answer is "ugh, these issues are too >>>> complicated, we should defer this until later when we have more >>>> experience with @ and gufuncs and stuff". After all, I doubt anyone >>>> else will swoop in and steal @@ to mean something else! OTOH, if e.g. >>>> there's a strong feeling that '@@ -1' will make a big difference in >>>> pedagogical contexts, then putting that off for years might be a >>>> mistake. >>>> >>>> >>> I don't have a strong feeling either way on '@@' . Matrix inverses are >>> pretty common in matrix expressions, but I don't know that the new operator >>> offers much advantage over a function call. The positive integer powers >>> might be useful in some domains, as others have pointed out, but >>> computational practice one would tend to factor the evaluation. >>> >>> Chuck >>> >> >> Personally I think it should go in, because: >> - it's useful (although marginally), as in the examples previously >> mentioned >> - it's what people will expect >> - it's the only reasonable use of @@ once @ makes it in >> >> As far as the details about precedence rules and what not... Yes, someone >> should think about them and come up with rules that make sense, but since >> it will be pretty much only be used in unambiguous situations, this >> shouldn't be a blocker. >> >> -=- Olivier >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From warren.weckesser at gmail.com Sat Mar 15 20:47:57 2014 From: warren.weckesser at gmail.com (Warren Weckesser) Date: Sat, 15 Mar 2014 20:47:57 -0400 Subject: [Numpy-discussion] [RFC] should we argue for a matrix power operator, @@? In-Reply-To: References: Message-ID: On Sat, Mar 15, 2014 at 8:38 PM, wrote: > I think I wouldn't use anything like @@ often enough to remember it's > meaning. I'd rather see english names for anything that is not **very** > common. > > I find A@@-1 pretty ugly compared to inv(A) > A@@(-0.5) might be nice (do we have matrix_sqrt ?) > scipy.linalg.sqrtm: http://docs.scipy.org/doc/scipy/reference/generated/scipy.linalg.sqrtm.html Warren > Josef > > > > On Sat, Mar 15, 2014 at 5:11 PM, Stephan Hoyer wrote: > >> Speaking only for myself (and as someone who has regularly used matrix >> powers), I would not expect matrix power as @@ to follow from matrix >> multiplication as @. I do agree that matrix power is the only reasonable >> use for @@ (given @), but it's still not something I would be confident >> enough to know without looking up. >> >> We should keep in mind that each new operator imposes some (small) >> cognitive burden on everyone who encounters them for the first time, and, >> in this case, this will include a large fraction of all Python users, >> whether they do numerical computation or not. >> >> Guido has given us a tremendous gift in the form of @. Let's not insist >> on @@, when it is unclear if the burden of figuring out what @@ means it >> would be worth using, even for heavily numeric code. I would certainly >> prefer to encounter norm(A), inv(A), matrix_power(A, n), >> fractional_matrix_power(A, n) and expm(A) rather than their infix >> equivalents. It will certainly not be obvious which of these @@ will >> support for objects from any given library. >> >> One useful data point might be to consider whether matrix power is >> available as an infix operator in other languages commonly used for >> numerical work. AFAICT from some quick searches: >> MATLAB: Yes >> R: No >> IDL: No >> >> All of these languages do, of course, implement infix matrix >> multiplication, but it is apparently not clear at all whether the matrix >> power is useful. >> >> Best, >> Stephan >> >> >> >> >> On Sat, Mar 15, 2014 at 9:03 AM, Olivier Delalleau wrote: >> >>> 2014-03-15 11:18 GMT-04:00 Charles R Harris : >>> >>> >>>> >>>> >>>> On Fri, Mar 14, 2014 at 10:32 PM, Nathaniel Smith wrote: >>>> >>>>> Hi all, >>>>> >>>>> Here's the second thread for discussion about Guido's concerns about >>>>> PEP 465. The issue here is that PEP 465 as currently written proposes >>>>> two new operators, @ for matrix multiplication and @@ for matrix power >>>>> (analogous to * and **): >>>>> http://legacy.python.org/dev/peps/pep-0465/ >>>>> >>>>> The main thing we care about of course is @; I pushed for including @@ >>>>> because I thought it was nicer to have than not, and I thought the >>>>> analogy between * and ** might make the overall package more appealing >>>>> to Guido's aesthetic sense. >>>>> >>>>> It turns out I was wrong :-). Guido is -0 on @@, but willing to be >>>>> swayed if we think it's worth the trouble to make a solid case. >>>>> >>>>> Note that question now is *not*, how will @@ affect the reception of >>>>> @. @ itself is AFAICT a done deal, regardless of what happens with @@. >>>>> For this discussion let's assume @ can be taken for granted, and that >>>>> we can freely choose to either add @@ or not add @@ to the language. >>>>> The question is: which do we think makes Python a better language (for >>>>> us and in general)? >>>>> >>>>> Some thoughts to start us off: >>>>> >>>>> Here are the interesting use cases for @@ that I can think of: >>>>> - 'vector @@ 2' gives the squared Euclidean length (because it's the >>>>> same as vector @ vector). Kind of handy. >>>>> - 'matrix @@ n' of course gives the matrix power, which is of marginal >>>>> use but does come in handy sometimes, e.g., when looking at graph >>>>> connectivity. >>>>> - 'matrix @@ -1' provides a very transparent notation for translating >>>>> textbook formulas (with all their inverses) into code. It's a bit >>>>> unhelpful in practice, because (a) usually you should use solve(), and >>>>> (b) 'matrix @@ -1' is actually more characters than 'inv(matrix)'. But >>>>> sometimes transparent notation may be important. (And in some cases, >>>>> like using numba or theano or whatever, 'matrix @@ -1 @ foo' could be >>>>> compiled into a call to solve() anyway.) >>>>> >>>>> (Did I miss any?) >>>>> >>>>> In practice it seems to me that the last use case is the one that's >>>>> might matter a lot practice, but then again, it might not -- I'm not >>>>> sure. For example, does anyone who teaches programming with numpy have >>>>> a feeling about whether the existence of '@@ -1' would make a big >>>>> difference to you and your students? (Alan? I know you were worried >>>>> about losing the .I attribute on matrices if switching to ndarrays for >>>>> teaching -- given that ndarray will probably not get a .I attribute, >>>>> how much would the existence of @@ -1 affect you?) >>>>> >>>>> On a more technical level, Guido is worried about how @@'s precedence >>>>> should work (and this is somewhat related to the other thread about >>>>> @'s precedence and associativity, because he feels that if we end up >>>>> giving @ and * different precedence, then that makes it much less >>>>> clear what to do with @@, and reduces the strength of the */**/@/@@ >>>>> analogy). In particular, if we want to argue for @@ then we'll need to >>>>> figure out what expressions like >>>>> a @@ b @@ c >>>>> and >>>>> a ** b @@ c >>>>> and >>>>> a @@ b ** c >>>>> should do. >>>>> >>>>> A related question is what @@ should do if given an array as its right >>>>> argument. In the current PEP, only integers are accepted, which rules >>>>> out a bunch of the more complicated cases like a @@ b @@ c (at least >>>>> assuming @@ is right-associative, like **, and I can't see why you'd >>>>> want anything else). OTOH, in the brave new gufunc world, it >>>>> technically would make sense to define @@ as being a gufunc with >>>>> signature (m,m),()->(m,m), and the way gufuncs work this *would* allow >>>>> the "power" to be an array -- for example, we'd have: >>>>> >>>>> mat = randn(m, m) >>>>> pow = range(n) >>>>> result = gufunc_matrix_power(mat, pow) >>>>> assert result.shape == (n, m, m) >>>>> for i in xrange(n): >>>>> assert np.all(result[i, :, :] == mat ** i) >>>>> >>>>> In this case, a @@ b @@ c would at least be a meaningful expression to >>>>> write. OTOH it would be incredibly bizarre and useless, so probably >>>>> no-one would ever write it. >>>>> >>>>> As far as these technical issues go, my guess is that the correct rule >>>>> is that @@ should just have the same precedence and the same (right) >>>>> associativity as **, and in practice no-one will ever write stuff like >>>>> a @@ b @@ c. But if we want to argue for @@ we need to come to some >>>>> consensus or another here. >>>>> >>>>> It's also possible the answer is "ugh, these issues are too >>>>> complicated, we should defer this until later when we have more >>>>> experience with @ and gufuncs and stuff". After all, I doubt anyone >>>>> else will swoop in and steal @@ to mean something else! OTOH, if e.g. >>>>> there's a strong feeling that '@@ -1' will make a big difference in >>>>> pedagogical contexts, then putting that off for years might be a >>>>> mistake. >>>>> >>>>> >>>> I don't have a strong feeling either way on '@@' . Matrix inverses are >>>> pretty common in matrix expressions, but I don't know that the new operator >>>> offers much advantage over a function call. The positive integer powers >>>> might be useful in some domains, as others have pointed out, but >>>> computational practice one would tend to factor the evaluation. >>>> >>>> Chuck >>>> >>> >>> Personally I think it should go in, because: >>> - it's useful (although marginally), as in the examples previously >>> mentioned >>> - it's what people will expect >>> - it's the only reasonable use of @@ once @ makes it in >>> >>> As far as the details about precedence rules and what not... Yes, >>> someone should think about them and come up with rules that make sense, but >>> since it will be pretty much only be used in unambiguous situations, this >>> shouldn't be a blocker. >>> >>> -=- Olivier >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >>> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Sat Mar 15 21:20:40 2014 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 15 Mar 2014 21:20:40 -0400 Subject: [Numpy-discussion] [help needed] associativity and precedence of '@' In-Reply-To: References: Message-ID: On Fri, Mar 14, 2014 at 11:41 PM, Nathaniel Smith wrote: > Hi all, > > Here's the main blocker for adding a matrix multiply operator '@' to > Python: we need to decide what we think its precedence and associativity > should be. I'll explain what that means so we're on the same page, and what > the choices are, and then we can all argue about it. But even better would > be if we could get some data to guide our decision, and this would be a lot > easier if some of you all can help; I'll suggest some ways you might be > able to do that. > > So! Precedence and left- versus right-associativity. If you already know > what these are you can skim down until you see CAPITAL LETTERS. > > We all know what precedence is. Code like this: > a + b * c > gets evaluated as: > a + (b * c) > because * has higher precedence than +. It "binds more tightly", as they > say. Python's complete precedence able is here: > http://docs.python.org/3/reference/expressions.html#operator-precedence > > Associativity, in the parsing sense, is less well known, though it's just > as important. It's about deciding how to evaluate code like this: > a * b * c > Do we use > a * (b * c) # * is "right associative" > or > (a * b) * c # * is "left associative" > ? Here all the operators have the same precedence (because, uh... they're > the same operator), so precedence doesn't help. And mostly we can ignore > this in day-to-day life, because both versions give the same answer, so who > cares. But a programming language has to pick one (consider what happens if > one of those objects has a non-default __mul__ implementation). And of > course it matters a lot for non-associative operations like > a - b - c > or > a / b / c > So when figuring out order of evaluations, what you do first is check the > precedence, and then if you have multiple operators next to each other with > the same precedence, you check their associativity. Notice that this means > that if you have different operators that share the same precedence level > (like + and -, or * and /), then they have to all have the same > associativity. All else being equal, it's generally considered nice to have > fewer precedence levels, because these have to be memorized by users. > > Right now in Python, every precedence level is left-associative, except > for '**'. If you write these formulas without any parentheses, then what > the interpreter will actually execute is: > (a * b) * c > (a - b) - c > (a / b) / c > but > a ** (b ** c) > > Okay, that's the background. Here's the question. We need to decide on > precedence and associativity for '@'. In particular, there are three > different options that are interesting: > > OPTION 1 FOR @: > Precedence: same as * > Associativity: left > My shorthand name for it: "same-left" (yes, very creative) > > This means that if you don't use parentheses, you get: > a @ b @ c -> (a @ b) @ c > a * b @ c -> (a * b) @ c > a @ b * c -> (a @ b) * c > > OPTION 2 FOR @: > Precedence: more-weakly-binding than * > Associativity: right > My shorthand name for it: "weak-right" > > This means that if you don't use parentheses, you get: > a @ b @ c -> a @ (b @ c) > a * b @ c -> (a * b) @ c > a @ b * c -> a @ (b * c) > > OPTION 3 FOR @: > Precedence: more-tightly-binding than * > Associativity: right > My shorthand name for it: "tight-right" > > This means that if you don't use parentheses, you get: > a @ b @ c -> a @ (b @ c) > a * b @ c -> a * (b @ c) > a @ b * c -> (a @ b) * c > > We need to pick which of which options we think is best, based on whatever > reasons we can think of, ideally more than "hmm, weak-right gives me warm > fuzzy feelings" ;-). (In principle the other 2 possible options are > tight-left and weak-left, but there doesn't seem to be any argument in > favor of either, so we'll leave them out of the discussion.) > > Some things to consider: > > * and @ are actually not associative (in the math sense) with respect to > each other, i.e., (a * b) @ c and a * (b @ c) in general give different > results when 'a' is not a scalar. So considering the two expressions 'a * b > @ c' and 'a @ b * c', we can see that each of these three options gives > produces different results in some cases. > > "Same-left" is the easiest to explain and remember, because it's just, "@ > acts like * and /". So we already have to know the rule in order to > understand other non-associative expressions like a / b / c or a - b - c, > and it'd be nice if the same rule applied to things like a * b @ c so we > only had to memorize *one* rule. (Of course there's ** which uses the > opposite rule, but I guess everyone internalized that one in secondary > school; that's not true for * versus @.) This is definitely the default we > should choose unless we have a good reason to do otherwise. > > BUT: there might indeed be a good reason to do otherwise, which is the > whole reason this has come up. Consider: > Mat1 @ Mat2 @ vec > Obviously this will execute much more quickly if we do > Mat1 @ (Mat2 @ vec) > because that results in two cheap matrix-vector multiplies, while > (Mat1 @ Mat2) @ vec > starts out by doing an expensive matrix-matrix multiply. So: maybe @ > should be right associative, so that we get the fast behaviour without > having to use explicit parentheses! /If/ these kinds of expressions are > common enough that having to remember to put explicit parentheses in all > the time is more of a programmer burden than having to memorize a special > associativity rule for @. Obviously Mat @ Mat @ vec is more common than vec > @ Mat @ Mat, but maybe they're both so rare that it doesn't matter in > practice -- I don't know. > > Also, if we do want @ to be right associative, then I can't think of any > clever reasons to prefer weak-right over tight-right, or vice-versa. For > the scalar multiplication case, I believe both options produce the same > result in the same amount of time. For the non-scalar case, they give > different answers. Do people have strong intuitions about what expressions > like > a * b @ c > a @ b * c > should do actually? (I'm guessing not, but hey, you never know.) > > And, while intuition is useful, it would be really *really* nice to be > basing these decisions on more than *just* intuition, since whatever we > decide will be subtly influencing the experience of writing linear algebra > code in Python for the rest of time. So here's where I could use some help. > First, of course, if you have any other reasons why one or the other of > these options is better, then please share! But second, I think we need to > know something about how often the Mat @ Mat @ vec type cases arise in > practice. How often do non-scalar * and np.dot show up in the same > expression? How often does it look like a * np.dot(b, c), and how often > does it look like np.dot(a * b, c)? How often do we see expressions like > np.dot(np.dot(a, b), c), and how often do we see expressions like np.dot(a, > np.dot(b, c))? This would really help guide the debate. I don't have this > data, and I'm not sure the best way to get it. A super-fancy approach would > be to write a little script that uses the 'ast' module to count things > automatically. A less fancy approach would be to just pick some code you've > written, or a well-known package, grep through for calls to 'dot', and make > notes on what you see. (An advantage of the less-fancy approach is that as > a human you might be able to tell the difference between scalar and > non-scalar *, or check whether it actually matters what order the 'dot' > calls are done in.) > > -n > > -- > Nathaniel J. Smith > Postdoctoral researcher - Informatics - University of Edinburgh > http://vorpus.org > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > I'm in favor of same-left because it's the easiest to remember. with scalar factors it is how I read formulas. Both calculating dot @ first or calculating elementwise * first sound logical, but I wouldn't know which should go first. (My "feeling" would be @ first.) two cases I remembered in statsmodels H = np.dot(results.model.pinv_wexog, scale[:,None] * results.model.pinv_wexog.T) se = (exog * np.dot(covb, exog.T).T).sum(1) we are mixing * and dot pretty freely in all combinations AFAIR my guess is that I wouldn't trust any sequence without parenthesis for a long time. (and I don't trust a sequence of dots @ without parenthesis either, in our applications.) x @ (W.T @ W) @ x ( W.shape = (10000, 5) ) or x * (W.T @ W) * x (w * x) @ x weighted sum of squares Josef -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Sat Mar 15 21:31:22 2014 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 15 Mar 2014 21:31:22 -0400 Subject: [Numpy-discussion] [RFC] should we argue for a matrix power operator, @@? In-Reply-To: References: Message-ID: On Sat, Mar 15, 2014 at 8:47 PM, Warren Weckesser < warren.weckesser at gmail.com> wrote: > > On Sat, Mar 15, 2014 at 8:38 PM, wrote: > >> I think I wouldn't use anything like @@ often enough to remember it's >> meaning. I'd rather see english names for anything that is not **very** >> common. >> >> I find A@@-1 pretty ugly compared to inv(A) >> A@@(-0.5) might be nice (do we have matrix_sqrt ?) >> > > > scipy.linalg.sqrtm: > http://docs.scipy.org/doc/scipy/reference/generated/scipy.linalg.sqrtm.html > maybe a good example: I could never figured that one out M = sqrtm(A) A = M @ M but what we use in stats is A = R.T @ R (eigenvectors dot diag(sqrt of eigenvalues) which sqrt is A@@(0.5) ? Josef > > > Warren > > > >> Josef >> >> >> >> On Sat, Mar 15, 2014 at 5:11 PM, Stephan Hoyer wrote: >> >>> Speaking only for myself (and as someone who has regularly used matrix >>> powers), I would not expect matrix power as @@ to follow from matrix >>> multiplication as @. I do agree that matrix power is the only reasonable >>> use for @@ (given @), but it's still not something I would be confident >>> enough to know without looking up. >>> >>> We should keep in mind that each new operator imposes some (small) >>> cognitive burden on everyone who encounters them for the first time, and, >>> in this case, this will include a large fraction of all Python users, >>> whether they do numerical computation or not. >>> >>> Guido has given us a tremendous gift in the form of @. Let's not insist >>> on @@, when it is unclear if the burden of figuring out what @@ means it >>> would be worth using, even for heavily numeric code. I would certainly >>> prefer to encounter norm(A), inv(A), matrix_power(A, n), >>> fractional_matrix_power(A, n) and expm(A) rather than their infix >>> equivalents. It will certainly not be obvious which of these @@ will >>> support for objects from any given library. >>> >>> One useful data point might be to consider whether matrix power is >>> available as an infix operator in other languages commonly used for >>> numerical work. AFAICT from some quick searches: >>> MATLAB: Yes >>> R: No >>> IDL: No >>> >>> All of these languages do, of course, implement infix matrix >>> multiplication, but it is apparently not clear at all whether the matrix >>> power is useful. >>> >>> Best, >>> Stephan >>> >>> >>> >>> >>> On Sat, Mar 15, 2014 at 9:03 AM, Olivier Delalleau wrote: >>> >>>> 2014-03-15 11:18 GMT-04:00 Charles R Harris >>>> : >>>> >>>> >>>>> >>>>> >>>>> On Fri, Mar 14, 2014 at 10:32 PM, Nathaniel Smith wrote: >>>>> >>>>>> Hi all, >>>>>> >>>>>> Here's the second thread for discussion about Guido's concerns about >>>>>> PEP 465. The issue here is that PEP 465 as currently written proposes >>>>>> two new operators, @ for matrix multiplication and @@ for matrix power >>>>>> (analogous to * and **): >>>>>> http://legacy.python.org/dev/peps/pep-0465/ >>>>>> >>>>>> The main thing we care about of course is @; I pushed for including @@ >>>>>> because I thought it was nicer to have than not, and I thought the >>>>>> analogy between * and ** might make the overall package more appealing >>>>>> to Guido's aesthetic sense. >>>>>> >>>>>> It turns out I was wrong :-). Guido is -0 on @@, but willing to be >>>>>> swayed if we think it's worth the trouble to make a solid case. >>>>>> >>>>>> Note that question now is *not*, how will @@ affect the reception of >>>>>> @. @ itself is AFAICT a done deal, regardless of what happens with @@. >>>>>> For this discussion let's assume @ can be taken for granted, and that >>>>>> we can freely choose to either add @@ or not add @@ to the language. >>>>>> The question is: which do we think makes Python a better language (for >>>>>> us and in general)? >>>>>> >>>>>> Some thoughts to start us off: >>>>>> >>>>>> Here are the interesting use cases for @@ that I can think of: >>>>>> - 'vector @@ 2' gives the squared Euclidean length (because it's the >>>>>> same as vector @ vector). Kind of handy. >>>>>> - 'matrix @@ n' of course gives the matrix power, which is of marginal >>>>>> use but does come in handy sometimes, e.g., when looking at graph >>>>>> connectivity. >>>>>> - 'matrix @@ -1' provides a very transparent notation for translating >>>>>> textbook formulas (with all their inverses) into code. It's a bit >>>>>> unhelpful in practice, because (a) usually you should use solve(), and >>>>>> (b) 'matrix @@ -1' is actually more characters than 'inv(matrix)'. But >>>>>> sometimes transparent notation may be important. (And in some cases, >>>>>> like using numba or theano or whatever, 'matrix @@ -1 @ foo' could be >>>>>> compiled into a call to solve() anyway.) >>>>>> >>>>>> (Did I miss any?) >>>>>> >>>>>> In practice it seems to me that the last use case is the one that's >>>>>> might matter a lot practice, but then again, it might not -- I'm not >>>>>> sure. For example, does anyone who teaches programming with numpy have >>>>>> a feeling about whether the existence of '@@ -1' would make a big >>>>>> difference to you and your students? (Alan? I know you were worried >>>>>> about losing the .I attribute on matrices if switching to ndarrays for >>>>>> teaching -- given that ndarray will probably not get a .I attribute, >>>>>> how much would the existence of @@ -1 affect you?) >>>>>> >>>>>> On a more technical level, Guido is worried about how @@'s precedence >>>>>> should work (and this is somewhat related to the other thread about >>>>>> @'s precedence and associativity, because he feels that if we end up >>>>>> giving @ and * different precedence, then that makes it much less >>>>>> clear what to do with @@, and reduces the strength of the */**/@/@@ >>>>>> analogy). In particular, if we want to argue for @@ then we'll need to >>>>>> figure out what expressions like >>>>>> a @@ b @@ c >>>>>> and >>>>>> a ** b @@ c >>>>>> and >>>>>> a @@ b ** c >>>>>> should do. >>>>>> >>>>>> A related question is what @@ should do if given an array as its right >>>>>> argument. In the current PEP, only integers are accepted, which rules >>>>>> out a bunch of the more complicated cases like a @@ b @@ c (at least >>>>>> assuming @@ is right-associative, like **, and I can't see why you'd >>>>>> want anything else). OTOH, in the brave new gufunc world, it >>>>>> technically would make sense to define @@ as being a gufunc with >>>>>> signature (m,m),()->(m,m), and the way gufuncs work this *would* allow >>>>>> the "power" to be an array -- for example, we'd have: >>>>>> >>>>>> mat = randn(m, m) >>>>>> pow = range(n) >>>>>> result = gufunc_matrix_power(mat, pow) >>>>>> assert result.shape == (n, m, m) >>>>>> for i in xrange(n): >>>>>> assert np.all(result[i, :, :] == mat ** i) >>>>>> >>>>>> In this case, a @@ b @@ c would at least be a meaningful expression to >>>>>> write. OTOH it would be incredibly bizarre and useless, so probably >>>>>> no-one would ever write it. >>>>>> >>>>>> As far as these technical issues go, my guess is that the correct rule >>>>>> is that @@ should just have the same precedence and the same (right) >>>>>> associativity as **, and in practice no-one will ever write stuff like >>>>>> a @@ b @@ c. But if we want to argue for @@ we need to come to some >>>>>> consensus or another here. >>>>>> >>>>>> It's also possible the answer is "ugh, these issues are too >>>>>> complicated, we should defer this until later when we have more >>>>>> experience with @ and gufuncs and stuff". After all, I doubt anyone >>>>>> else will swoop in and steal @@ to mean something else! OTOH, if e.g. >>>>>> there's a strong feeling that '@@ -1' will make a big difference in >>>>>> pedagogical contexts, then putting that off for years might be a >>>>>> mistake. >>>>>> >>>>>> >>>>> I don't have a strong feeling either way on '@@' . Matrix inverses are >>>>> pretty common in matrix expressions, but I don't know that the new operator >>>>> offers much advantage over a function call. The positive integer powers >>>>> might be useful in some domains, as others have pointed out, but >>>>> computational practice one would tend to factor the evaluation. >>>>> >>>>> Chuck >>>>> >>>> >>>> Personally I think it should go in, because: >>>> - it's useful (although marginally), as in the examples previously >>>> mentioned >>>> - it's what people will expect >>>> - it's the only reasonable use of @@ once @ makes it in >>>> >>>> As far as the details about precedence rules and what not... Yes, >>>> someone should think about them and come up with rules that make sense, but >>>> since it will be pretty much only be used in unambiguous situations, this >>>> shouldn't be a blocker. >>>> >>>> -=- Olivier >>>> >>>> _______________________________________________ >>>> NumPy-Discussion mailing list >>>> NumPy-Discussion at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>>> >>>> >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >>> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Sat Mar 15 22:12:08 2014 From: njs at pobox.com (Nathaniel Smith) Date: Sun, 16 Mar 2014 02:12:08 +0000 Subject: [Numpy-discussion] [RFC] should we argue for a matrix power operator, @@? In-Reply-To: <532451E2.5000205@gmail.com> References: <532451E2.5000205@gmail.com> Message-ID: On Sat, Mar 15, 2014 at 1:13 PM, Alan G Isaac wrote: > On 3/15/2014 12:32 AM, Nathaniel Smith wrote: >> I know you were worried >> about losing the .I attribute on matrices if switching to ndarrays for >> teaching -- given that ndarray will probably not get a .I attribute, >> how much would the existence of @@ -1 affect you? > > Not much. Positive integer powers would be useful > (for illustrating e.g. graph theory and difference equations), > but not enough to delay the PEP. So to be clear, even if numpy.matrix is going away, and even if ndarray isn't getting a .I attribute, then you're just as happy typing/teaching inv(X) as X @@ -1? > I think NumPy should "take the money and run". > Getting `@` is great. Let's get experience with > it before deciding whether it's worth asking for `@@`. > > Questions for `@@`: > - would it just be `matrix_power`, with all the restrictions? > - or would `a(10,2,2)@@-1` return an array of matrix inverses? > - etc The version in the PEP does do gufunc-style broadcasting for >2d arrays, yes. So will np.linalg.matrix_power as soon as someone bothers to send a patch ;-) > In the end, I'd like to see a functional implementation before > deciding on `@@`, but I would not like to see `@` delayed at all. Oh, well, not much is going to affect `@`'s timing, unless we're *dreadfully* slow. Py 3.5 isn't even scheduled yet b/c 3.4 isn't out, and IIUC Python's standard release cycle is 18 months. So we've got a year+ before feature freeze, regardless. -n -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org From charlesr.harris at gmail.com Sat Mar 15 23:30:49 2014 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 15 Mar 2014 21:30:49 -0600 Subject: [Numpy-discussion] [help needed] associativity and precedence of '@' In-Reply-To: References: Message-ID: On Sat, Mar 15, 2014 at 7:20 PM, wrote: > > > > On Fri, Mar 14, 2014 at 11:41 PM, Nathaniel Smith wrote: > >> Hi all, >> >> Here's the main blocker for adding a matrix multiply operator '@' to >> Python: we need to decide what we think its precedence and associativity >> should be. I'll explain what that means so we're on the same page, and what >> the choices are, and then we can all argue about it. But even better would >> be if we could get some data to guide our decision, and this would be a lot >> easier if some of you all can help; I'll suggest some ways you might be >> able to do that. >> >> So! Precedence and left- versus right-associativity. If you already know >> what these are you can skim down until you see CAPITAL LETTERS. >> >> We all know what precedence is. Code like this: >> a + b * c >> gets evaluated as: >> a + (b * c) >> because * has higher precedence than +. It "binds more tightly", as they >> say. Python's complete precedence able is here: >> http://docs.python.org/3/reference/expressions.html#operator-precedence >> >> Associativity, in the parsing sense, is less well known, though it's just >> as important. It's about deciding how to evaluate code like this: >> a * b * c >> Do we use >> a * (b * c) # * is "right associative" >> or >> (a * b) * c # * is "left associative" >> ? Here all the operators have the same precedence (because, uh... they're >> the same operator), so precedence doesn't help. And mostly we can ignore >> this in day-to-day life, because both versions give the same answer, so who >> cares. But a programming language has to pick one (consider what happens if >> one of those objects has a non-default __mul__ implementation). And of >> course it matters a lot for non-associative operations like >> a - b - c >> or >> a / b / c >> So when figuring out order of evaluations, what you do first is check the >> precedence, and then if you have multiple operators next to each other with >> the same precedence, you check their associativity. Notice that this means >> that if you have different operators that share the same precedence level >> (like + and -, or * and /), then they have to all have the same >> associativity. All else being equal, it's generally considered nice to have >> fewer precedence levels, because these have to be memorized by users. >> >> Right now in Python, every precedence level is left-associative, except >> for '**'. If you write these formulas without any parentheses, then what >> the interpreter will actually execute is: >> (a * b) * c >> (a - b) - c >> (a / b) / c >> but >> a ** (b ** c) >> >> Okay, that's the background. Here's the question. We need to decide on >> precedence and associativity for '@'. In particular, there are three >> different options that are interesting: >> >> OPTION 1 FOR @: >> Precedence: same as * >> Associativity: left >> My shorthand name for it: "same-left" (yes, very creative) >> >> This means that if you don't use parentheses, you get: >> a @ b @ c -> (a @ b) @ c >> a * b @ c -> (a * b) @ c >> a @ b * c -> (a @ b) * c >> >> OPTION 2 FOR @: >> Precedence: more-weakly-binding than * >> Associativity: right >> My shorthand name for it: "weak-right" >> >> This means that if you don't use parentheses, you get: >> a @ b @ c -> a @ (b @ c) >> a * b @ c -> (a * b) @ c >> a @ b * c -> a @ (b * c) >> >> OPTION 3 FOR @: >> Precedence: more-tightly-binding than * >> Associativity: right >> My shorthand name for it: "tight-right" >> >> This means that if you don't use parentheses, you get: >> a @ b @ c -> a @ (b @ c) >> a * b @ c -> a * (b @ c) >> a @ b * c -> (a @ b) * c >> >> We need to pick which of which options we think is best, based on >> whatever reasons we can think of, ideally more than "hmm, weak-right gives >> me warm fuzzy feelings" ;-). (In principle the other 2 possible options are >> tight-left and weak-left, but there doesn't seem to be any argument in >> favor of either, so we'll leave them out of the discussion.) >> >> Some things to consider: >> >> * and @ are actually not associative (in the math sense) with respect to >> each other, i.e., (a * b) @ c and a * (b @ c) in general give different >> results when 'a' is not a scalar. So considering the two expressions 'a * b >> @ c' and 'a @ b * c', we can see that each of these three options gives >> produces different results in some cases. >> >> "Same-left" is the easiest to explain and remember, because it's just, "@ >> acts like * and /". So we already have to know the rule in order to >> understand other non-associative expressions like a / b / c or a - b - c, >> and it'd be nice if the same rule applied to things like a * b @ c so we >> only had to memorize *one* rule. (Of course there's ** which uses the >> opposite rule, but I guess everyone internalized that one in secondary >> school; that's not true for * versus @.) This is definitely the default we >> should choose unless we have a good reason to do otherwise. >> >> BUT: there might indeed be a good reason to do otherwise, which is the >> whole reason this has come up. Consider: >> Mat1 @ Mat2 @ vec >> Obviously this will execute much more quickly if we do >> Mat1 @ (Mat2 @ vec) >> because that results in two cheap matrix-vector multiplies, while >> (Mat1 @ Mat2) @ vec >> starts out by doing an expensive matrix-matrix multiply. So: maybe @ >> should be right associative, so that we get the fast behaviour without >> having to use explicit parentheses! /If/ these kinds of expressions are >> common enough that having to remember to put explicit parentheses in all >> the time is more of a programmer burden than having to memorize a special >> associativity rule for @. Obviously Mat @ Mat @ vec is more common than vec >> @ Mat @ Mat, but maybe they're both so rare that it doesn't matter in >> practice -- I don't know. >> >> Also, if we do want @ to be right associative, then I can't think of any >> clever reasons to prefer weak-right over tight-right, or vice-versa. For >> the scalar multiplication case, I believe both options produce the same >> result in the same amount of time. For the non-scalar case, they give >> different answers. Do people have strong intuitions about what expressions >> like >> a * b @ c >> a @ b * c >> should do actually? (I'm guessing not, but hey, you never know.) >> >> And, while intuition is useful, it would be really *really* nice to be >> basing these decisions on more than *just* intuition, since whatever we >> decide will be subtly influencing the experience of writing linear algebra >> code in Python for the rest of time. So here's where I could use some help. >> First, of course, if you have any other reasons why one or the other of >> these options is better, then please share! But second, I think we need to >> know something about how often the Mat @ Mat @ vec type cases arise in >> practice. How often do non-scalar * and np.dot show up in the same >> expression? How often does it look like a * np.dot(b, c), and how often >> does it look like np.dot(a * b, c)? How often do we see expressions like >> np.dot(np.dot(a, b), c), and how often do we see expressions like np.dot(a, >> np.dot(b, c))? This would really help guide the debate. I don't have this >> data, and I'm not sure the best way to get it. A super-fancy approach would >> be to write a little script that uses the 'ast' module to count things >> automatically. A less fancy approach would be to just pick some code you've >> written, or a well-known package, grep through for calls to 'dot', and make >> notes on what you see. (An advantage of the less-fancy approach is that as >> a human you might be able to tell the difference between scalar and >> non-scalar *, or check whether it actually matters what order the 'dot' >> calls are done in.) >> >> -n >> >> -- >> Nathaniel J. Smith >> Postdoctoral researcher - Informatics - University of Edinburgh >> http://vorpus.org >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > I'm in favor of same-left because it's the easiest to remember. > with scalar factors it is how I read formulas. > Note that if there are no (interior) vectors involved then the two methods of association give theoretically identical results. But when there is a vector on the right and no vector on the left, then right association is more efficient and likely more numerically accurate. > Both calculating dot @ first or calculating elementwise * first sound > logical, but I wouldn't know which should go first. (My "feeling" would be > @ first.) > > > two cases I remembered in statsmodels > H = np.dot(results.model.pinv_wexog, scale[:,None] * > results.model.pinv_wexog.T) > se = (exog * np.dot(covb, exog.T).T).sum(1) > > we are mixing * and dot pretty freely in all combinations AFAIR > > my guess is that I wouldn't trust any sequence without parenthesis for a > long time. > (and I don't trust a sequence of dots @ without parenthesis either, in our > applications.) > > x @ (W.T @ W) @ x ( W.shape = (10000, 5) ) > or > x * (W.T @ W) * x > > Judicious use of parenthesis is definitely recommended no matter what is decided. > (w * x) @ x weighted sum of squares > > Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Sun Mar 16 00:53:41 2014 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sun, 16 Mar 2014 00:53:41 -0400 Subject: [Numpy-discussion] [help needed] associativity and precedence of '@' In-Reply-To: References: Message-ID: On Sat, Mar 15, 2014 at 11:30 PM, Charles R Harris < charlesr.harris at gmail.com> wrote: > > > > On Sat, Mar 15, 2014 at 7:20 PM, wrote: > >> >> >> >> On Fri, Mar 14, 2014 at 11:41 PM, Nathaniel Smith wrote: >> >>> Hi all, >>> >>> Here's the main blocker for adding a matrix multiply operator '@' to >>> Python: we need to decide what we think its precedence and associativity >>> should be. I'll explain what that means so we're on the same page, and what >>> the choices are, and then we can all argue about it. But even better would >>> be if we could get some data to guide our decision, and this would be a lot >>> easier if some of you all can help; I'll suggest some ways you might be >>> able to do that. >>> >>> So! Precedence and left- versus right-associativity. If you already know >>> what these are you can skim down until you see CAPITAL LETTERS. >>> >>> We all know what precedence is. Code like this: >>> a + b * c >>> gets evaluated as: >>> a + (b * c) >>> because * has higher precedence than +. It "binds more tightly", as they >>> say. Python's complete precedence able is here: >>> >>> http://docs.python.org/3/reference/expressions.html#operator-precedence >>> >>> Associativity, in the parsing sense, is less well known, though it's >>> just as important. It's about deciding how to evaluate code like this: >>> a * b * c >>> Do we use >>> a * (b * c) # * is "right associative" >>> or >>> (a * b) * c # * is "left associative" >>> ? Here all the operators have the same precedence (because, uh... >>> they're the same operator), so precedence doesn't help. And mostly we can >>> ignore this in day-to-day life, because both versions give the same answer, >>> so who cares. But a programming language has to pick one (consider what >>> happens if one of those objects has a non-default __mul__ implementation). >>> And of course it matters a lot for non-associative operations like >>> a - b - c >>> or >>> a / b / c >>> So when figuring out order of evaluations, what you do first is check >>> the precedence, and then if you have multiple operators next to each other >>> with the same precedence, you check their associativity. Notice that this >>> means that if you have different operators that share the same precedence >>> level (like + and -, or * and /), then they have to all have the same >>> associativity. All else being equal, it's generally considered nice to have >>> fewer precedence levels, because these have to be memorized by users. >>> >>> Right now in Python, every precedence level is left-associative, except >>> for '**'. If you write these formulas without any parentheses, then what >>> the interpreter will actually execute is: >>> (a * b) * c >>> (a - b) - c >>> (a / b) / c >>> but >>> a ** (b ** c) >>> >>> Okay, that's the background. Here's the question. We need to decide on >>> precedence and associativity for '@'. In particular, there are three >>> different options that are interesting: >>> >>> OPTION 1 FOR @: >>> Precedence: same as * >>> Associativity: left >>> My shorthand name for it: "same-left" (yes, very creative) >>> >>> This means that if you don't use parentheses, you get: >>> a @ b @ c -> (a @ b) @ c >>> a * b @ c -> (a * b) @ c >>> a @ b * c -> (a @ b) * c >>> >>> OPTION 2 FOR @: >>> Precedence: more-weakly-binding than * >>> Associativity: right >>> My shorthand name for it: "weak-right" >>> >>> This means that if you don't use parentheses, you get: >>> a @ b @ c -> a @ (b @ c) >>> a * b @ c -> (a * b) @ c >>> a @ b * c -> a @ (b * c) >>> >>> OPTION 3 FOR @: >>> Precedence: more-tightly-binding than * >>> Associativity: right >>> My shorthand name for it: "tight-right" >>> >>> This means that if you don't use parentheses, you get: >>> a @ b @ c -> a @ (b @ c) >>> a * b @ c -> a * (b @ c) >>> a @ b * c -> (a @ b) * c >>> >>> We need to pick which of which options we think is best, based on >>> whatever reasons we can think of, ideally more than "hmm, weak-right gives >>> me warm fuzzy feelings" ;-). (In principle the other 2 possible options are >>> tight-left and weak-left, but there doesn't seem to be any argument in >>> favor of either, so we'll leave them out of the discussion.) >>> >>> Some things to consider: >>> >>> * and @ are actually not associative (in the math sense) with respect to >>> each other, i.e., (a * b) @ c and a * (b @ c) in general give different >>> results when 'a' is not a scalar. So considering the two expressions 'a * b >>> @ c' and 'a @ b * c', we can see that each of these three options gives >>> produces different results in some cases. >>> >>> "Same-left" is the easiest to explain and remember, because it's just, >>> "@ acts like * and /". So we already have to know the rule in order to >>> understand other non-associative expressions like a / b / c or a - b - c, >>> and it'd be nice if the same rule applied to things like a * b @ c so we >>> only had to memorize *one* rule. (Of course there's ** which uses the >>> opposite rule, but I guess everyone internalized that one in secondary >>> school; that's not true for * versus @.) This is definitely the default we >>> should choose unless we have a good reason to do otherwise. >>> >>> BUT: there might indeed be a good reason to do otherwise, which is the >>> whole reason this has come up. Consider: >>> Mat1 @ Mat2 @ vec >>> Obviously this will execute much more quickly if we do >>> Mat1 @ (Mat2 @ vec) >>> because that results in two cheap matrix-vector multiplies, while >>> (Mat1 @ Mat2) @ vec >>> starts out by doing an expensive matrix-matrix multiply. So: maybe @ >>> should be right associative, so that we get the fast behaviour without >>> having to use explicit parentheses! /If/ these kinds of expressions are >>> common enough that having to remember to put explicit parentheses in all >>> the time is more of a programmer burden than having to memorize a special >>> associativity rule for @. Obviously Mat @ Mat @ vec is more common than vec >>> @ Mat @ Mat, but maybe they're both so rare that it doesn't matter in >>> practice -- I don't know. >>> >>> Also, if we do want @ to be right associative, then I can't think of any >>> clever reasons to prefer weak-right over tight-right, or vice-versa. For >>> the scalar multiplication case, I believe both options produce the same >>> result in the same amount of time. For the non-scalar case, they give >>> different answers. Do people have strong intuitions about what expressions >>> like >>> a * b @ c >>> a @ b * c >>> should do actually? (I'm guessing not, but hey, you never know.) >>> >>> And, while intuition is useful, it would be really *really* nice to be >>> basing these decisions on more than *just* intuition, since whatever we >>> decide will be subtly influencing the experience of writing linear algebra >>> code in Python for the rest of time. So here's where I could use some help. >>> First, of course, if you have any other reasons why one or the other of >>> these options is better, then please share! But second, I think we need to >>> know something about how often the Mat @ Mat @ vec type cases arise in >>> practice. How often do non-scalar * and np.dot show up in the same >>> expression? How often does it look like a * np.dot(b, c), and how often >>> does it look like np.dot(a * b, c)? How often do we see expressions like >>> np.dot(np.dot(a, b), c), and how often do we see expressions like np.dot(a, >>> np.dot(b, c))? This would really help guide the debate. I don't have this >>> data, and I'm not sure the best way to get it. A super-fancy approach would >>> be to write a little script that uses the 'ast' module to count things >>> automatically. A less fancy approach would be to just pick some code you've >>> written, or a well-known package, grep through for calls to 'dot', and make >>> notes on what you see. (An advantage of the less-fancy approach is that as >>> a human you might be able to tell the difference between scalar and >>> non-scalar *, or check whether it actually matters what order the 'dot' >>> calls are done in.) >>> >>> -n >>> >>> -- >>> Nathaniel J. Smith >>> Postdoctoral researcher - Informatics - University of Edinburgh >>> http://vorpus.org >>> >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >>> >> >> I'm in favor of same-left because it's the easiest to remember. >> with scalar factors it is how I read formulas. >> > > Note that if there are no (interior) vectors involved then the two methods > of association give theoretically identical results. But when there is a > vector on the right and no vector on the left, then right association is > more efficient and likely more numerically accurate. > What's so special about a vector on the right? What if I have a vector on the left, or, as is pretty common, a quadratic form? having a different associative rule between a * b * c and A @ B @ C looks confusing to me. there is something special about the last array (in numpy) np.arange(5).dot(np.diag(np.ones(5))).dot(np.arange(10).reshape(5, 2, order="F")) np.arange(5).dot(np.diag(np.ones(5))).dot(np.arange(5)) np.arange(5).dot(np.diag(np.ones(5))).dot(np.arange(5).reshape(5, 1)) chains go left to right Josef > > >> Both calculating dot @ first or calculating elementwise * first sound >> logical, but I wouldn't know which should go first. (My "feeling" would be >> @ first.) >> >> >> two cases I remembered in statsmodels >> H = np.dot(results.model.pinv_wexog, scale[:,None] * >> results.model.pinv_wexog.T) >> se = (exog * np.dot(covb, exog.T).T).sum(1) >> >> we are mixing * and dot pretty freely in all combinations AFAIR >> >> my guess is that I wouldn't trust any sequence without parenthesis for a >> long time. >> (and I don't trust a sequence of dots @ without parenthesis either, in >> our applications.) >> >> x @ (W.T @ W) @ x ( W.shape = (10000, 5) ) >> or >> x * (W.T @ W) * x >> >> > Judicious use of parenthesis is definitely recommended no matter what is > decided. > > >> (w * x) @ x weighted sum of squares >> >> > Chuck > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sun Mar 16 01:23:40 2014 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 15 Mar 2014 23:23:40 -0600 Subject: [Numpy-discussion] [help needed] associativity and precedence of '@' In-Reply-To: References: Message-ID: On Sat, Mar 15, 2014 at 10:53 PM, wrote: > > > > On Sat, Mar 15, 2014 at 11:30 PM, Charles R Harris < > charlesr.harris at gmail.com> wrote: > >> >> >> >> On Sat, Mar 15, 2014 at 7:20 PM, wrote: >> >>> >>> >>> >>> On Fri, Mar 14, 2014 at 11:41 PM, Nathaniel Smith wrote: >>> >>>> Hi all, >>>> >>>> Here's the main blocker for adding a matrix multiply operator '@' to >>>> Python: we need to decide what we think its precedence and associativity >>>> should be. I'll explain what that means so we're on the same page, and what >>>> the choices are, and then we can all argue about it. But even better would >>>> be if we could get some data to guide our decision, and this would be a lot >>>> easier if some of you all can help; I'll suggest some ways you might be >>>> able to do that. >>>> >>>> So! Precedence and left- versus right-associativity. If you already >>>> know what these are you can skim down until you see CAPITAL LETTERS. >>>> >>>> We all know what precedence is. Code like this: >>>> a + b * c >>>> gets evaluated as: >>>> a + (b * c) >>>> because * has higher precedence than +. It "binds more tightly", as >>>> they say. Python's complete precedence able is here: >>>> >>>> http://docs.python.org/3/reference/expressions.html#operator-precedence >>>> >>>> Associativity, in the parsing sense, is less well known, though it's >>>> just as important. It's about deciding how to evaluate code like this: >>>> a * b * c >>>> Do we use >>>> a * (b * c) # * is "right associative" >>>> or >>>> (a * b) * c # * is "left associative" >>>> ? Here all the operators have the same precedence (because, uh... >>>> they're the same operator), so precedence doesn't help. And mostly we can >>>> ignore this in day-to-day life, because both versions give the same answer, >>>> so who cares. But a programming language has to pick one (consider what >>>> happens if one of those objects has a non-default __mul__ implementation). >>>> And of course it matters a lot for non-associative operations like >>>> a - b - c >>>> or >>>> a / b / c >>>> So when figuring out order of evaluations, what you do first is check >>>> the precedence, and then if you have multiple operators next to each other >>>> with the same precedence, you check their associativity. Notice that this >>>> means that if you have different operators that share the same precedence >>>> level (like + and -, or * and /), then they have to all have the same >>>> associativity. All else being equal, it's generally considered nice to have >>>> fewer precedence levels, because these have to be memorized by users. >>>> >>>> Right now in Python, every precedence level is left-associative, except >>>> for '**'. If you write these formulas without any parentheses, then what >>>> the interpreter will actually execute is: >>>> (a * b) * c >>>> (a - b) - c >>>> (a / b) / c >>>> but >>>> a ** (b ** c) >>>> >>>> Okay, that's the background. Here's the question. We need to decide on >>>> precedence and associativity for '@'. In particular, there are three >>>> different options that are interesting: >>>> >>>> OPTION 1 FOR @: >>>> Precedence: same as * >>>> Associativity: left >>>> My shorthand name for it: "same-left" (yes, very creative) >>>> >>>> This means that if you don't use parentheses, you get: >>>> a @ b @ c -> (a @ b) @ c >>>> a * b @ c -> (a * b) @ c >>>> a @ b * c -> (a @ b) * c >>>> >>>> OPTION 2 FOR @: >>>> Precedence: more-weakly-binding than * >>>> Associativity: right >>>> My shorthand name for it: "weak-right" >>>> >>>> This means that if you don't use parentheses, you get: >>>> a @ b @ c -> a @ (b @ c) >>>> a * b @ c -> (a * b) @ c >>>> a @ b * c -> a @ (b * c) >>>> >>>> OPTION 3 FOR @: >>>> Precedence: more-tightly-binding than * >>>> Associativity: right >>>> My shorthand name for it: "tight-right" >>>> >>>> This means that if you don't use parentheses, you get: >>>> a @ b @ c -> a @ (b @ c) >>>> a * b @ c -> a * (b @ c) >>>> a @ b * c -> (a @ b) * c >>>> >>>> We need to pick which of which options we think is best, based on >>>> whatever reasons we can think of, ideally more than "hmm, weak-right gives >>>> me warm fuzzy feelings" ;-). (In principle the other 2 possible options are >>>> tight-left and weak-left, but there doesn't seem to be any argument in >>>> favor of either, so we'll leave them out of the discussion.) >>>> >>>> Some things to consider: >>>> >>>> * and @ are actually not associative (in the math sense) with respect >>>> to each other, i.e., (a * b) @ c and a * (b @ c) in general give different >>>> results when 'a' is not a scalar. So considering the two expressions 'a * b >>>> @ c' and 'a @ b * c', we can see that each of these three options gives >>>> produces different results in some cases. >>>> >>>> "Same-left" is the easiest to explain and remember, because it's just, >>>> "@ acts like * and /". So we already have to know the rule in order to >>>> understand other non-associative expressions like a / b / c or a - b - c, >>>> and it'd be nice if the same rule applied to things like a * b @ c so we >>>> only had to memorize *one* rule. (Of course there's ** which uses the >>>> opposite rule, but I guess everyone internalized that one in secondary >>>> school; that's not true for * versus @.) This is definitely the default we >>>> should choose unless we have a good reason to do otherwise. >>>> >>>> BUT: there might indeed be a good reason to do otherwise, which is the >>>> whole reason this has come up. Consider: >>>> Mat1 @ Mat2 @ vec >>>> Obviously this will execute much more quickly if we do >>>> Mat1 @ (Mat2 @ vec) >>>> because that results in two cheap matrix-vector multiplies, while >>>> (Mat1 @ Mat2) @ vec >>>> starts out by doing an expensive matrix-matrix multiply. So: maybe @ >>>> should be right associative, so that we get the fast behaviour without >>>> having to use explicit parentheses! /If/ these kinds of expressions are >>>> common enough that having to remember to put explicit parentheses in all >>>> the time is more of a programmer burden than having to memorize a special >>>> associativity rule for @. Obviously Mat @ Mat @ vec is more common than vec >>>> @ Mat @ Mat, but maybe they're both so rare that it doesn't matter in >>>> practice -- I don't know. >>>> >>>> Also, if we do want @ to be right associative, then I can't think of >>>> any clever reasons to prefer weak-right over tight-right, or vice-versa. >>>> For the scalar multiplication case, I believe both options produce the same >>>> result in the same amount of time. For the non-scalar case, they give >>>> different answers. Do people have strong intuitions about what expressions >>>> like >>>> a * b @ c >>>> a @ b * c >>>> should do actually? (I'm guessing not, but hey, you never know.) >>>> >>>> And, while intuition is useful, it would be really *really* nice to be >>>> basing these decisions on more than *just* intuition, since whatever we >>>> decide will be subtly influencing the experience of writing linear algebra >>>> code in Python for the rest of time. So here's where I could use some help. >>>> First, of course, if you have any other reasons why one or the other of >>>> these options is better, then please share! But second, I think we need to >>>> know something about how often the Mat @ Mat @ vec type cases arise in >>>> practice. How often do non-scalar * and np.dot show up in the same >>>> expression? How often does it look like a * np.dot(b, c), and how often >>>> does it look like np.dot(a * b, c)? How often do we see expressions like >>>> np.dot(np.dot(a, b), c), and how often do we see expressions like np.dot(a, >>>> np.dot(b, c))? This would really help guide the debate. I don't have this >>>> data, and I'm not sure the best way to get it. A super-fancy approach would >>>> be to write a little script that uses the 'ast' module to count things >>>> automatically. A less fancy approach would be to just pick some code you've >>>> written, or a well-known package, grep through for calls to 'dot', and make >>>> notes on what you see. (An advantage of the less-fancy approach is that as >>>> a human you might be able to tell the difference between scalar and >>>> non-scalar *, or check whether it actually matters what order the 'dot' >>>> calls are done in.) >>>> >>>> -n >>>> >>>> -- >>>> Nathaniel J. Smith >>>> Postdoctoral researcher - Informatics - University of Edinburgh >>>> http://vorpus.org >>>> >>>> >>>> _______________________________________________ >>>> NumPy-Discussion mailing list >>>> NumPy-Discussion at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>>> >>>> >>> >>> I'm in favor of same-left because it's the easiest to remember. >>> with scalar factors it is how I read formulas. >>> >> >> Note that if there are no (interior) vectors involved then the two >> methods of association give theoretically identical results. But when there >> is a vector on the right and no vector on the left, then right association >> is more efficient and likely more numerically accurate. >> > > What's so special about a vector on the right? What if I have a vector on > the left, or, as is pretty common, a quadratic form? > A vector on the right is a fairly common pattern. If one were to use parenthesis as you did in your example to gain efficiency, one would write 'A@(B@(C at v))' as all the multiplications are then matrix -- vector, which is computationally cheaper than matrix -- matrix. When the '@' operator is right associative the parenthesis don't need to be used to get the same result. Of course, for the less common pattern of a single (row) vector on the left, left associativity would be preferred. For vectors on both ends it doesn't matter. So the choice is driven simply by which pattern is the most common. Function composition works the same way: g at f@h(x) = g(f(h(x))). That is, the traditional notation goes right to left. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From alan.isaac at gmail.com Sun Mar 16 09:07:45 2014 From: alan.isaac at gmail.com (Alan G Isaac) Date: Sun, 16 Mar 2014 09:07:45 -0400 Subject: [Numpy-discussion] [RFC] should we argue for a matrix power operator, @@? In-Reply-To: References: <532451E2.5000205@gmail.com> Message-ID: <5325A221.20906@gmail.com> On 3/15/2014 10:12 PM, Nathaniel Smith wrote: > So to be clear, even if numpy.matrix is going away, and even if > ndarray isn't getting a .I attribute, then you're just as happy > typing/teaching inv(X) as X @@ -1? Yes, that is correct. I am somewhat more unhappy with having to use npla.matrix_power(M,n) instead of M@@n in other teaching settings (e.g., graph theory and recurrence relations). I am certainly not objecting to making `@@` available. It just seems much less important than getting `@` asap. Thanks, Alan Isaac From hoogendoorn.eelco at gmail.com Sun Mar 16 10:39:25 2014 From: hoogendoorn.eelco at gmail.com (Eelco Hoogendoorn) Date: Sun, 16 Mar 2014 15:39:25 +0100 Subject: [Numpy-discussion] It looks like Py 3.5 will include a dedicated infix matrix multiply operator In-Reply-To: References: Message-ID: Note that I am not opposed to extra operators in python, and only mildly opposed to a matrix multiplication operator in numpy; but let me lay out the case against, for your consideration. First of all, the use of matrix semantics relative to arrays semantics is extremely rare; even in linear algebra heavy code, arrays semantics often dominate. As such, the default of array semantics for numpy has been a great choice. Ive never looked back at MATLAB semantics. Secondly, I feel the urge to conform to a historical mathematical notation is misguided, especially for the problem domain of linear algebra. Perhaps in the world of mathematics your operation is associative or commutes, but on your computer, the order of operations will influence both outcomes and performance. Even for products, we usually care not only about the outcome, but also how that outcome is arrived at. And along the same lines, I don't suppose I need to explain how I feel about A@@-1 and the likes. Sure, it isn't to hard to learn or infer this implies a matrix inverse, but why on earth would I want to pretend the rich complexity of numerical matrix inversion can be mangled into one symbol? Id much rather write inv or pinv, or whatever particular algorithm happens to be called for given the situation. Considering this isn't the num-lisp discussion group, I suppose I am hardly the only one who feels so. On the whole, I feel the @ operator is mostly superfluous. I prefer to be explicit about where I place my brackets. I prefer to be explicit about the data layout and axes that go into a (multi)linear product, rather than rely on obtuse row/column conventions which are not transparent across function calls. When I do linear algebra, it is almost always vectorized over additional axes; how does a special operator which is only well defined for a few special cases of 2d and 1d tensors help me with that? On the whole, the linear algebra conventions inspired by the particular constraints of people working with blackboards, are a rather ugly and hacky beast in my opinion, which I feel no inclination to emulate. As a sidenote to the contrary; I love using broadcasting semantics when writing papers. Sure, your reviewers will balk at it, but it wouldn't do to give the dinosaurs the last word on what any given formal language ought to be like. We get to define the future, and im not sure the set of conventions that goes under the name of 'matrix multiplication' is one of particular importance to the future of numerical linear algebra. Note that I don't think there is much harm in an @ operator; but I don't see myself using it either. Aside from making textbook examples like a gram-schmidt orthogonalization more compact to write, I don't see it having much of an impact in the real world. On Sat, Mar 15, 2014 at 3:52 PM, Charles R Harris wrote: > > > > On Fri, Mar 14, 2014 at 6:51 PM, Nathaniel Smith wrote: > >> Well, that was fast. Guido says he'll accept the addition of '@' as an >> infix operator for matrix multiplication, once some details are ironed >> out: >> https://mail.python.org/pipermail/python-ideas/2014-March/027109.html >> http://legacy.python.org/dev/peps/pep-0465/ >> >> Specifically, we need to figure out whether we want to make an >> argument for a matrix power operator ("@@"), and what >> precedence/associativity we want '@' to have. I'll post two separate >> threads to get feedback on those in an organized way -- this is just a >> heads-up. >> >> > Surprisingly little discussion on python-ideas, or so it seemed to me. > Guido came out in favor less than halfway through. Congratulations on > putting together a successful proposal, many of us had given up on ever > seeing a matrix multiplication operator. > > Chuck > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From scopatz at gmail.com Sun Mar 16 10:41:51 2014 From: scopatz at gmail.com (Anthony Scopatz) Date: Sun, 16 Mar 2014 09:41:51 -0500 Subject: [Numpy-discussion] It looks like Py 3.5 will include a dedicated infix matrix multiply operator In-Reply-To: References: Message-ID: This is awesome! Congrats! On Sun, Mar 16, 2014 at 9:39 AM, Eelco Hoogendoorn < hoogendoorn.eelco at gmail.com> wrote: > Note that I am not opposed to extra operators in python, and only mildly > opposed to a matrix multiplication operator in numpy; but let me lay out > the case against, for your consideration. > > First of all, the use of matrix semantics relative to arrays semantics is > extremely rare; even in linear algebra heavy code, arrays semantics often > dominate. As such, the default of array semantics for numpy has been a > great choice. Ive never looked back at MATLAB semantics. > > Secondly, I feel the urge to conform to a historical mathematical notation > is misguided, especially for the problem domain of linear algebra. Perhaps > in the world of mathematics your operation is associative or commutes, but > on your computer, the order of operations will influence both outcomes and > performance. Even for products, we usually care not only about the outcome, > but also how that outcome is arrived at. And along the same lines, I don't > suppose I need to explain how I feel about A@@-1 and the likes. Sure, it > isn't to hard to learn or infer this implies a matrix inverse, but why on > earth would I want to pretend the rich complexity of numerical matrix > inversion can be mangled into one symbol? Id much rather write inv or pinv, > or whatever particular algorithm happens to be called for given the > situation. Considering this isn't the num-lisp discussion group, I suppose > I am hardly the only one who feels so. > > On the whole, I feel the @ operator is mostly superfluous. I prefer to be > explicit about where I place my brackets. I prefer to be explicit about the > data layout and axes that go into a (multi)linear product, rather than rely > on obtuse row/column conventions which are not transparent across function > calls. When I do linear algebra, it is almost always vectorized over > additional axes; how does a special operator which is only well defined for > a few special cases of 2d and 1d tensors help me with that? On the > whole, the linear algebra conventions inspired by the particular > constraints of people working with blackboards, are a rather ugly and hacky > beast in my opinion, which I feel no inclination to emulate. As a sidenote > to the contrary; I love using broadcasting semantics when writing papers. > Sure, your reviewers will balk at it, but it wouldn't do to give the > dinosaurs the last word on what any given formal language ought to be like. > We get to define the future, and im not sure the set of conventions that > goes under the name of 'matrix multiplication' is one of particular > importance to the future of numerical linear algebra. > > Note that I don't think there is much harm in an @ operator; but I don't > see myself using it either. Aside from making textbook examples like a > gram-schmidt orthogonalization more compact to write, I don't see it having > much of an impact in the real world. > > > On Sat, Mar 15, 2014 at 3:52 PM, Charles R Harris < > charlesr.harris at gmail.com> wrote: > >> >> >> >> On Fri, Mar 14, 2014 at 6:51 PM, Nathaniel Smith wrote: >> >>> Well, that was fast. Guido says he'll accept the addition of '@' as an >>> infix operator for matrix multiplication, once some details are ironed >>> out: >>> https://mail.python.org/pipermail/python-ideas/2014-March/027109.html >>> http://legacy.python.org/dev/peps/pep-0465/ >>> >>> Specifically, we need to figure out whether we want to make an >>> argument for a matrix power operator ("@@"), and what >>> precedence/associativity we want '@' to have. I'll post two separate >>> threads to get feedback on those in an organized way -- this is just a >>> heads-up. >>> >>> >> Surprisingly little discussion on python-ideas, or so it seemed to me. >> Guido came out in favor less than halfway through. Congratulations on >> putting together a successful proposal, many of us had given up on ever >> seeing a matrix multiplication operator. >> >> Chuck >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From silva at lma.cnrs-mrs.fr Sun Mar 16 10:49:13 2014 From: silva at lma.cnrs-mrs.fr (Fabrice Silva) Date: Sun, 16 Mar 2014 15:49:13 +0100 Subject: [Numpy-discussion] [RFC] should we argue for a matrix power operator, @@? In-Reply-To: References: Message-ID: <1394981353.14814.7.camel@laptop-101> Le samedi 15 mars 2014 ? 04:32 +0000, Nathaniel Smith a ?crit : > Hi all, > > Here's the second thread for discussion about Guido's concerns about > PEP 465. The issue here is that PEP 465 as currently written proposes > two new operators, @ for matrix multiplication and @@ for matrix power > (analogous to * and **): > http://legacy.python.org/dev/peps/pep-0465/ Another usecase may rely on tensor contraction. Matrix multiplication appears to be a particular case of tensor contraction for matrix seen as 2nd-order tensor : (A @ B)_{ij} = A_{ik} B_{kj} using Einstein summation notation. @@ might also be used for double contraction as frequently used in continuum mechanics. For example, the relation between strain and stress (2nd order tensors) involves the elasticity tensor (a 4nd order one) using the double contraction : S_{ij} = C_{ijkl}E_{kl} that might be simply calculated with S = C @@ E, the variables S, E, C being instances of whatever class representing tensors. My two cents From njs at pobox.com Sun Mar 16 10:54:42 2014 From: njs at pobox.com (Nathaniel Smith) Date: Sun, 16 Mar 2014 14:54:42 +0000 Subject: [Numpy-discussion] It looks like Py 3.5 will include a dedicated infix matrix multiply operator In-Reply-To: References: Message-ID: On Sun, Mar 16, 2014 at 2:39 PM, Eelco Hoogendoorn wrote: > Note that I am not opposed to extra operators in python, and only mildly > opposed to a matrix multiplication operator in numpy; but let me lay out the > case against, for your consideration. > > First of all, the use of matrix semantics relative to arrays semantics is > extremely rare; even in linear algebra heavy code, arrays semantics often > dominate. As such, the default of array semantics for numpy has been a great > choice. Ive never looked back at MATLAB semantics. Different people work on different code and have different experiences here -- yours may or may be typical yours. Pauli did some quick checks on scikit-learn & nipy & scipy, and found that in their test suites, uses of np.dot and uses of elementwise-multiplication are ~equally common: https://github.com/numpy/numpy/pull/4351#issuecomment-37717330h > Secondly, I feel the urge to conform to a historical mathematical notation > is misguided, especially for the problem domain of linear algebra. Perhaps > in the world of mathematics your operation is associative or commutes, but > on your computer, the order of operations will influence both outcomes and > performance. Even for products, we usually care not only about the outcome, > but also how that outcome is arrived at. And along the same lines, I don't > suppose I need to explain how I feel about A@@-1 and the likes. Sure, it > isn't to hard to learn or infer this implies a matrix inverse, but why on > earth would I want to pretend the rich complexity of numerical matrix > inversion can be mangled into one symbol? Id much rather write inv or pinv, > or whatever particular algorithm happens to be called for given the > situation. Considering this isn't the num-lisp discussion group, I suppose I > am hardly the only one who feels so. > My impression from the other thread is that @@ probably won't end up existing, so you're safe here ;-). > On the whole, I feel the @ operator is mostly superfluous. I prefer to be > explicit about where I place my brackets. I prefer to be explicit about the > data layout and axes that go into a (multi)linear product, rather than rely > on obtuse row/column conventions which are not transparent across function > calls. When I do linear algebra, it is almost always vectorized over > additional axes; how does a special operator which is only well defined for > a few special cases of 2d and 1d tensors help me with that? Einstein notation is coming up on its 100th birthday and is just as blackboard-friendly as matrix product notation. Yet there's still a huge number of domains where the matrix notation dominates. It's cool if you aren't one of the people who find it useful, but I don't think it's going anywhere soon. > Note that I don't think there is much harm in an @ operator; but I don't see > myself using it either. Aside from making textbook examples like a > gram-schmidt orthogonalization more compact to write, I don't see it having > much of an impact in the real world. The analysis in the PEP found ~780 calls to np.dot, just in the two projects I happened to look at. @ will get tons of use in the real world. Maybe all those people who will be using it would be happier if they were using einsum instead, I dunno, but it's an argument you'll have to convince them of, not me :-). -n -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org From josef.pktd at gmail.com Sun Mar 16 11:13:15 2014 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sun, 16 Mar 2014 11:13:15 -0400 Subject: [Numpy-discussion] It looks like Py 3.5 will include a dedicated infix matrix multiply operator In-Reply-To: References: Message-ID: On Sun, Mar 16, 2014 at 10:54 AM, Nathaniel Smith wrote: > On Sun, Mar 16, 2014 at 2:39 PM, Eelco Hoogendoorn > wrote: > > Note that I am not opposed to extra operators in python, and only mildly > > opposed to a matrix multiplication operator in numpy; but let me lay out > the > > case against, for your consideration. > > > > First of all, the use of matrix semantics relative to arrays semantics is > > extremely rare; even in linear algebra heavy code, arrays semantics often > > dominate. As such, the default of array semantics for numpy has been a > great > > choice. Ive never looked back at MATLAB semantics. > > Different people work on different code and have different experiences > here -- yours may or may be typical yours. Pauli did some quick checks > on scikit-learn & nipy & scipy, and found that in their test suites, > uses of np.dot and uses of elementwise-multiplication are ~equally > common: https://github.com/numpy/numpy/pull/4351#issuecomment-37717330h > > > Secondly, I feel the urge to conform to a historical mathematical > notation > > is misguided, especially for the problem domain of linear algebra. > Perhaps > > in the world of mathematics your operation is associative or commutes, > but > > on your computer, the order of operations will influence both outcomes > and > > performance. Even for products, we usually care not only about the > outcome, > > but also how that outcome is arrived at. And along the same lines, I > don't > > suppose I need to explain how I feel about A@@-1 and the likes. Sure, it > > isn't to hard to learn or infer this implies a matrix inverse, but why on > > earth would I want to pretend the rich complexity of numerical matrix > > inversion can be mangled into one symbol? Id much rather write inv or > pinv, > > or whatever particular algorithm happens to be called for given the > > situation. Considering this isn't the num-lisp discussion group, I > suppose I > > am hardly the only one who feels so. > > > > My impression from the other thread is that @@ probably won't end up > existing, so you're safe here ;-). > > > On the whole, I feel the @ operator is mostly superfluous. I prefer to be > > explicit about where I place my brackets. I prefer to be explicit about > the > > data layout and axes that go into a (multi)linear product, rather than > rely > > on obtuse row/column conventions which are not transparent across > function > > calls. When I do linear algebra, it is almost always vectorized over > > additional axes; how does a special operator which is only well defined > for > > a few special cases of 2d and 1d tensors help me with that? > > Einstein notation is coming up on its 100th birthday and is just as > blackboard-friendly as matrix product notation. Yet there's still a > huge number of domains where the matrix notation dominates. It's cool > if you aren't one of the people who find it useful, but I don't think > it's going anywhere soon. > > > Note that I don't think there is much harm in an @ operator; but I don't > see > > myself using it either. Aside from making textbook examples like a > > gram-schmidt orthogonalization more compact to write, I don't see it > having > > much of an impact in the real world. > > The analysis in the PEP found ~780 calls to np.dot, just in the two > projects I happened to look at. @ will get tons of use in the real > world. Maybe all those people who will be using it would be happier if > they were using einsum instead, I dunno, but it's an argument you'll > have to convince them of, not me :-). > Just as example I just read for the first time two journal articles in econometrics that use einsum notation. I have no idea what their formulas are supposed to mean, no sum signs and no matrix algebra. I need to have a strong incentive to stare at those formulas again. (statsmodels search finds 1520 "dot", including sandbox and examples) Josef > > -n > > -- > Nathaniel J. Smith > Postdoctoral researcher - Informatics - University of Edinburgh > http://vorpus.org > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ewm at redtetrahedron.org Sun Mar 16 12:20:25 2014 From: ewm at redtetrahedron.org (Eric Moore) Date: Sun, 16 Mar 2014 12:20:25 -0400 Subject: [Numpy-discussion] It looks like Py 3.5 will include a dedicated infix matrix multiply operator In-Reply-To: References: Message-ID: On Sunday, March 16, 2014, wrote: > > > > On Sun, Mar 16, 2014 at 10:54 AM, Nathaniel Smith > > wrote: > >> On Sun, Mar 16, 2014 at 2:39 PM, Eelco Hoogendoorn >> > >> wrote: >> > Note that I am not opposed to extra operators in python, and only mildly >> > opposed to a matrix multiplication operator in numpy; but let me lay >> out the >> > case against, for your consideration. >> > >> > First of all, the use of matrix semantics relative to arrays semantics >> is >> > extremely rare; even in linear algebra heavy code, arrays semantics >> often >> > dominate. As such, the default of array semantics for numpy has been a >> great >> > choice. Ive never looked back at MATLAB semantics. >> >> Different people work on different code and have different experiences >> here -- yours may or may be typical yours. Pauli did some quick checks >> on scikit-learn & nipy & scipy, and found that in their test suites, >> uses of np.dot and uses of elementwise-multiplication are ~equally >> common: https://github.com/numpy/numpy/pull/4351#issuecomment-37717330h >> >> > Secondly, I feel the urge to conform to a historical mathematical >> notation >> > is misguided, especially for the problem domain of linear algebra. >> Perhaps >> > in the world of mathematics your operation is associative or commutes, >> but >> > on your computer, the order of operations will influence both outcomes >> and >> > performance. Even for products, we usually care not only about the >> outcome, >> > but also how that outcome is arrived at. And along the same lines, I >> don't >> > suppose I need to explain how I feel about A@@-1 and the likes. Sure, >> it >> > isn't to hard to learn or infer this implies a matrix inverse, but why >> on >> > earth would I want to pretend the rich complexity of numerical matrix >> > inversion can be mangled into one symbol? Id much rather write inv or >> pinv, >> > or whatever particular algorithm happens to be called for given the >> > situation. Considering this isn't the num-lisp discussion group, I >> suppose I >> > am hardly the only one who feels so. >> > >> >> My impression from the other thread is that @@ probably won't end up >> existing, so you're safe here ;-). >> >> > On the whole, I feel the @ operator is mostly superfluous. I prefer to >> be >> > explicit about where I place my brackets. I prefer to be explicit about >> the >> > data layout and axes that go into a (multi)linear product, rather than >> rely >> > on obtuse row/column conventions which are not transparent across >> function >> > calls. When I do linear algebra, it is almost always vectorized over >> > additional axes; how does a special operator which is only well defined >> for >> > a few special cases of 2d and 1d tensors help me with that? >> >> Einstein notation is coming up on its 100th birthday and is just as >> blackboard-friendly as matrix product notation. Yet there's still a >> huge number of domains where the matrix notation dominates. It's cool >> if you aren't one of the people who find it useful, but I don't think >> it's going anywhere soon. >> >> > Note that I don't think there is much harm in an @ operator; but I >> don't see >> > myself using it either. Aside from making textbook examples like a >> > gram-schmidt orthogonalization more compact to write, I don't see it >> having >> > much of an impact in the real world. >> >> The analysis in the PEP found ~780 calls to np.dot, just in the two >> projects I happened to look at. @ will get tons of use in the real >> world. Maybe all those people who will be using it would be happier if >> they were using einsum instead, I dunno, but it's an argument you'll >> have to convince them of, not me :-). >> > > Just as example > > I just read for the first time two journal articles in econometrics that > use einsum notation. > I have no idea what their formulas are supposed to mean, no sum signs and > no matrix algebra. > I need to have a strong incentive to stare at those formulas again. > > (statsmodels search finds 1520 "dot", including sandbox and examples) > > Josef > > > >> >> -n >> >> -- >> Nathaniel J. Smith >> Postdoctoral researcher - Informatics - University of Edinburgh >> http://vorpus.org >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > An important distinction between calling dot or @ is that matrix multiplication is a domain where enormous effort has already been spent on algorithms and building fast, scalable libraries. Yes einsum can call these for some subset of calls but it's also trivial to set up a case where it can't. This is a huge pitfall because it hides this complexity. Matrix-matrix and matrix-vector products are the fundamental operations, generalized multilinear products etc are not. Einsum, despite the brevity that it can provide, is too general to make a basic building block. There isn't a good way to reason about its runtime. Eric -------------- next part -------------- An HTML attachment was scrubbed... URL: From hoogendoorn.eelco at gmail.com Sun Mar 16 12:33:31 2014 From: hoogendoorn.eelco at gmail.com (Eelco Hoogendoorn) Date: Sun, 16 Mar 2014 17:33:31 +0100 Subject: [Numpy-discussion] It looks like Py 3.5 will include a dedicated infix matrix multiply operator In-Reply-To: References: Message-ID: Different people work on different code and have different experiences here -- yours may or may be typical yours. Pauli did some quick checks on scikit-learn & nipy & scipy, and found that in their test suites, uses of np.dot and uses of elementwise-multiplication are ~equally common: https://github.com/numpy/numpy/pull/4351#issuecomment-37717330h Yeah; these are examples of linalg-heavy packages. Even there, dot does not dominate. My impression from the other thread is that @@ probably won't end up existing, so you're safe here ;-). I know; my point is that the same objections apply to @, albeit in weaker form. Einstein notation is coming up on its 100th birthday and is just as blackboard-friendly as matrix product notation. Yet there's still a huge number of domains where the matrix notation dominates. It's cool if you aren't one of the people who find it useful, but I don't think it's going anywhere soon. Einstein notation is just as blackboard friendly; but also much more computer-future proof. I am not saying matrix multiplication is going anywhere soon; but as far as I can tell that is all inertia; historical circumstance has not accidentially prepared it well for numerical needs, as far as I can tell. The analysis in the PEP found ~780 calls to np.dot, just in the two projects I happened to look at. @ will get tons of use in the real world. Maybe all those people who will be using it would be happier if they were using einsum instead, I dunno, but it's an argument you'll have to convince them of, not me :-). 780 calls is not tons of use, and these projects are outliers id argue. I just read for the first time two journal articles in econometrics that use einsum notation. I have no idea what their formulas are supposed to mean, no sum signs and no matrix algebra. If they could have been expressed more clearly otherwise, of course this is what they should have done; but could they? b_i = A_ij x_j isnt exactly hard to read, but if it was some form of complicated product, its probably tensor notation was their best bet. -------------- next part -------------- An HTML attachment was scrubbed... URL: From cjwilliams43 at gmail.com Sun Mar 16 12:37:14 2014 From: cjwilliams43 at gmail.com (Colin J. Williams) Date: Sun, 16 Mar 2014 12:37:14 -0400 Subject: [Numpy-discussion] NumPy-Discussion Digest, Vol 90, Issue 45 In-Reply-To: References: Message-ID: <5325D33A.7050501@gmail.com> I would like to see the case made for @. Yes, I know that Guido has accepted the idea, but he has changed his mind before. The PEP seems neutral to retaining both np.matrix and @. Nearly ten years ago, Tim Peters gave us: /There should be one-- and preferably only one --obvious way to do it. / W/e now have: / /C= A * B C becomes an instance of the Matrix class (m, p) When A and B are matrices a matrix of (m, n) and (n, p) respectively. Actually, the rules are a little more general than the above. / The PEP proposes that /C= /A @ B where the types or classes of A, B and C are not clear. We also have A.I for the inverse, for the square matrix) or A.T for the transpose of a matrix. One way is recommended in the Zen of Python, of the two, which is the obvious way? Colin W. / / On 15-Mar-2014 9:25 PM, numpy-discussion-request at scipy.org wrote: > Send NumPy-Discussion mailing list submissions to > numpy-discussion at scipy.org > > To subscribe or unsubscribe via the World Wide Web, visit > http://mail.scipy.org/mailman/listinfo/numpy-discussion > or, via email, send a message with subject or body 'help' to > numpy-discussion-request at scipy.org > > You can reach the person managing the list at > numpy-discussion-owner at scipy.org > > When replying, please edit your Subject line so it is more specific > than "Re: Contents of NumPy-Discussion digest..." > > > Today's Topics: > > 1. Re: [help needed] associativity and precedence of '@' > (josef.pktd at gmail.com) > 2. Re: [RFC] should we argue for a matrix power operator, @@? > (josef.pktd at gmail.com) > > > ---------------------------------------------------------------------- > > Message: 1 > Date: Sat, 15 Mar 2014 21:20:40 -0400 > From: josef.pktd at gmail.com > Subject: Re: [Numpy-discussion] [help needed] associativity and > precedence of '@' > To: Discussion of Numerical Python > Message-ID: > > Content-Type: text/plain; charset="iso-8859-1" > > On Fri, Mar 14, 2014 at 11:41 PM, Nathaniel Smith wrote: > >> Hi all, >> >> Here's the main blocker for adding a matrix multiply operator '@' to >> Python: we need to decide what we think its precedence and associativity >> should be. I'll explain what that means so we're on the same page, and what >> the choices are, and then we can all argue about it. But even better would >> be if we could get some data to guide our decision, and this would be a lot >> easier if some of you all can help; I'll suggest some ways you might be >> able to do that. >> >> So! Precedence and left- versus right-associativity. If you already know >> what these are you can skim down until you see CAPITAL LETTERS. >> >> We all know what precedence is. Code like this: >> a + b * c >> gets evaluated as: >> a + (b * c) >> because * has higher precedence than +. It "binds more tightly", as they >> say. Python's complete precedence able is here: >> http://docs.python.org/3/reference/expressions.html#operator-precedence >> >> Associativity, in the parsing sense, is less well known, though it's just >> as important. It's about deciding how to evaluate code like this: >> a * b * c >> Do we use >> a * (b * c) # * is "right associative" >> or >> (a * b) * c # * is "left associative" >> ? Here all the operators have the same precedence (because, uh... they're >> the same operator), so precedence doesn't help. And mostly we can ignore >> this in day-to-day life, because both versions give the same answer, so who >> cares. But a programming language has to pick one (consider what happens if >> one of those objects has a non-default __mul__ implementation). And of >> course it matters a lot for non-associative operations like >> a - b - c >> or >> a / b / c >> So when figuring out order of evaluations, what you do first is check the >> precedence, and then if you have multiple operators next to each other with >> the same precedence, you check their associativity. Notice that this means >> that if you have different operators that share the same precedence level >> (like + and -, or * and /), then they have to all have the same >> associativity. All else being equal, it's generally considered nice to have >> fewer precedence levels, because these have to be memorized by users. >> >> Right now in Python, every precedence level is left-associative, except >> for '**'. If you write these formulas without any parentheses, then what >> the interpreter will actually execute is: >> (a * b) * c >> (a - b) - c >> (a / b) / c >> but >> a ** (b ** c) >> >> Okay, that's the background. Here's the question. We need to decide on >> precedence and associativity for '@'. In particular, there are three >> different options that are interesting: >> >> OPTION 1 FOR @: >> Precedence: same as * >> Associativity: left >> My shorthand name for it: "same-left" (yes, very creative) >> >> This means that if you don't use parentheses, you get: >> a @ b @ c -> (a @ b) @ c >> a * b @ c -> (a * b) @ c >> a @ b * c -> (a @ b) * c >> >> OPTION 2 FOR @: >> Precedence: more-weakly-binding than * >> Associativity: right >> My shorthand name for it: "weak-right" >> >> This means that if you don't use parentheses, you get: >> a @ b @ c -> a @ (b @ c) >> a * b @ c -> (a * b) @ c >> a @ b * c -> a @ (b * c) >> >> OPTION 3 FOR @: >> Precedence: more-tightly-binding than * >> Associativity: right >> My shorthand name for it: "tight-right" >> >> This means that if you don't use parentheses, you get: >> a @ b @ c -> a @ (b @ c) >> a * b @ c -> a * (b @ c) >> a @ b * c -> (a @ b) * c >> >> We need to pick which of which options we think is best, based on whatever >> reasons we can think of, ideally more than "hmm, weak-right gives me warm >> fuzzy feelings" ;-). (In principle the other 2 possible options are >> tight-left and weak-left, but there doesn't seem to be any argument in >> favor of either, so we'll leave them out of the discussion.) >> >> Some things to consider: >> >> * and @ are actually not associative (in the math sense) with respect to >> each other, i.e., (a * b) @ c and a * (b @ c) in general give different >> results when 'a' is not a scalar. So considering the two expressions 'a * b >> @ c' and 'a @ b * c', we can see that each of these three options gives >> produces different results in some cases. >> >> "Same-left" is the easiest to explain and remember, because it's just, "@ >> acts like * and /". So we already have to know the rule in order to >> understand other non-associative expressions like a / b / c or a - b - c, >> and it'd be nice if the same rule applied to things like a * b @ c so we >> only had to memorize *one* rule. (Of course there's ** which uses the >> opposite rule, but I guess everyone internalized that one in secondary >> school; that's not true for * versus @.) This is definitely the default we >> should choose unless we have a good reason to do otherwise. >> >> BUT: there might indeed be a good reason to do otherwise, which is the >> whole reason this has come up. Consider: >> Mat1 @ Mat2 @ vec >> Obviously this will execute much more quickly if we do >> Mat1 @ (Mat2 @ vec) >> because that results in two cheap matrix-vector multiplies, while >> (Mat1 @ Mat2) @ vec >> starts out by doing an expensive matrix-matrix multiply. So: maybe @ >> should be right associative, so that we get the fast behaviour without >> having to use explicit parentheses! /If/ these kinds of expressions are >> common enough that having to remember to put explicit parentheses in all >> the time is more of a programmer burden than having to memorize a special >> associativity rule for @. Obviously Mat @ Mat @ vec is more common than vec >> @ Mat @ Mat, but maybe they're both so rare that it doesn't matter in >> practice -- I don't know. >> >> Also, if we do want @ to be right associative, then I can't think of any >> clever reasons to prefer weak-right over tight-right, or vice-versa. For >> the scalar multiplication case, I believe both options produce the same >> result in the same amount of time. For the non-scalar case, they give >> different answers. Do people have strong intuitions about what expressions >> like >> a * b @ c >> a @ b * c >> should do actually? (I'm guessing not, but hey, you never know.) >> >> And, while intuition is useful, it would be really *really* nice to be >> basing these decisions on more than *just* intuition, since whatever we >> decide will be subtly influencing the experience of writing linear algebra >> code in Python for the rest of time. So here's where I could use some help. >> First, of course, if you have any other reasons why one or the other of >> these options is better, then please share! But second, I think we need to >> know something about how often the Mat @ Mat @ vec type cases arise in >> practice. How often do non-scalar * and np.dot show up in the same >> expression? How often does it look like a * np.dot(b, c), and how often >> does it look like np.dot(a * b, c)? How often do we see expressions like >> np.dot(np.dot(a, b), c), and how often do we see expressions like np.dot(a, >> np.dot(b, c))? This would really help guide the debate. I don't have this >> data, and I'm not sure the best way to get it. A super-fancy approach would >> be to write a little script that uses the 'ast' module to count things >> automatically. A less fancy approach would be to just pick some code you've >> written, or a well-known package, grep through for calls to 'dot', and make >> notes on what you see. (An advantage of the less-fancy approach is that as >> a human you might be able to tell the difference between scalar and >> non-scalar *, or check whether it actually matters what order the 'dot' >> calls are done in.) >> >> -n >> >> -- >> Nathaniel J. Smith >> Postdoctoral researcher - Informatics - University of Edinburgh >> http://vorpus.org >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > I'm in favor of same-left because it's the easiest to remember. > with scalar factors it is how I read formulas. > > Both calculating dot @ first or calculating elementwise * first sound > logical, but I wouldn't know which should go first. (My "feeling" would be > @ first.) > > > two cases I remembered in statsmodels > H = np.dot(results.model.pinv_wexog, scale[:,None] * > results.model.pinv_wexog.T) > se = (exog * np.dot(covb, exog.T).T).sum(1) > > we are mixing * and dot pretty freely in all combinations AFAIR > > my guess is that I wouldn't trust any sequence without parenthesis for a > long time. > (and I don't trust a sequence of dots @ without parenthesis either, in our > applications.) > > x @ (W.T @ W) @ x ( W.shape = (10000, 5) ) > or > x * (W.T @ W) * x > > (w * x) @ x weighted sum of squares > > Josef > -------------- next part -------------- > An HTML attachment was scrubbed... > URL: http://mail.scipy.org/pipermail/numpy-discussion/attachments/20140315/d4126289/attachment-0001.html > > ------------------------------ > > Message: 2 > Date: Sat, 15 Mar 2014 21:31:22 -0400 > From: josef.pktd at gmail.com > Subject: Re: [Numpy-discussion] [RFC] should we argue for a matrix > power operator, @@? > To: Discussion of Numerical Python > Message-ID: > > Content-Type: text/plain; charset="iso-8859-1" > > On Sat, Mar 15, 2014 at 8:47 PM, Warren Weckesser < > warren.weckesser at gmail.com> wrote: > >> On Sat, Mar 15, 2014 at 8:38 PM, wrote: >> >>> I think I wouldn't use anything like @@ often enough to remember it's >>> meaning. I'd rather see english names for anything that is not **very** >>> common. >>> >>> I find A@@-1 pretty ugly compared to inv(A) >>> A@@(-0.5) might be nice (do we have matrix_sqrt ?) >>> >> >> scipy.linalg.sqrtm: >> http://docs.scipy.org/doc/scipy/reference/generated/scipy.linalg.sqrtm.html >> > maybe a good example: I could never figured that one out > > M = sqrtm(A) > > A = M @ M > > but what we use in stats is > > A = R.T @ R > (eigenvectors dot diag(sqrt of eigenvalues) > > which sqrt is A@@(0.5) ? > > Josef > > > >> >> Warren >> >> >> >>> Josef >>> >>> >>> >>> On Sat, Mar 15, 2014 at 5:11 PM, Stephan Hoyer wrote: >>> >>>> Speaking only for myself (and as someone who has regularly used matrix >>>> powers), I would not expect matrix power as @@ to follow from matrix >>>> multiplication as @. I do agree that matrix power is the only reasonable >>>> use for @@ (given @), but it's still not something I would be confident >>>> enough to know without looking up. >>>> >>>> We should keep in mind that each new operator imposes some (small) >>>> cognitive burden on everyone who encounters them for the first time, and, >>>> in this case, this will include a large fraction of all Python users, >>>> whether they do numerical computation or not. >>>> >>>> Guido has given us a tremendous gift in the form of @. Let's not insist >>>> on @@, when it is unclear if the burden of figuring out what @@ means it >>>> would be worth using, even for heavily numeric code. I would certainly >>>> prefer to encounter norm(A), inv(A), matrix_power(A, n), >>>> fractional_matrix_power(A, n) and expm(A) rather than their infix >>>> equivalents. It will certainly not be obvious which of these @@ will >>>> support for objects from any given library. >>>> >>>> One useful data point might be to consider whether matrix power is >>>> available as an infix operator in other languages commonly used for >>>> numerical work. AFAICT from some quick searches: >>>> MATLAB: Yes >>>> R: No >>>> IDL: No >>>> >>>> All of these languages do, of course, implement infix matrix >>>> multiplication, but it is apparently not clear at all whether the matrix >>>> power is useful. >>>> >>>> Best, >>>> Stephan >>>> >>>> >>>> >>>> >>>> On Sat, Mar 15, 2014 at 9:03 AM, Olivier Delalleau wrote: >>>> >>>>> 2014-03-15 11:18 GMT-04:00 Charles R Harris >>>>> : >>>>> >>>>> >>>>>> >>>>>> On Fri, Mar 14, 2014 at 10:32 PM, Nathaniel Smith wrote: >>>>>> >>>>>>> Hi all, >>>>>>> >>>>>>> Here's the second thread for discussion about Guido's concerns about >>>>>>> PEP 465. The issue here is that PEP 465 as currently written proposes >>>>>>> two new operators, @ for matrix multiplication and @@ for matrix power >>>>>>> (analogous to * and **): >>>>>>> http://legacy.python.org/dev/peps/pep-0465/ >>>>>>> >>>>>>> The main thing we care about of course is @; I pushed for including @@ >>>>>>> because I thought it was nicer to have than not, and I thought the >>>>>>> analogy between * and ** might make the overall package more appealing >>>>>>> to Guido's aesthetic sense. >>>>>>> >>>>>>> It turns out I was wrong :-). Guido is -0 on @@, but willing to be >>>>>>> swayed if we think it's worth the trouble to make a solid case. >>>>>>> >>>>>>> Note that question now is *not*, how will @@ affect the reception of >>>>>>> @. @ itself is AFAICT a done deal, regardless of what happens with @@. >>>>>>> For this discussion let's assume @ can be taken for granted, and that >>>>>>> we can freely choose to either add @@ or not add @@ to the language. >>>>>>> The question is: which do we think makes Python a better language (for >>>>>>> us and in general)? >>>>>>> >>>>>>> Some thoughts to start us off: >>>>>>> >>>>>>> Here are the interesting use cases for @@ that I can think of: >>>>>>> - 'vector @@ 2' gives the squared Euclidean length (because it's the >>>>>>> same as vector @ vector). Kind of handy. >>>>>>> - 'matrix @@ n' of course gives the matrix power, which is of marginal >>>>>>> use but does come in handy sometimes, e.g., when looking at graph >>>>>>> connectivity. >>>>>>> - 'matrix @@ -1' provides a very transparent notation for translating >>>>>>> textbook formulas (with all their inverses) into code. It's a bit >>>>>>> unhelpful in practice, because (a) usually you should use solve(), and >>>>>>> (b) 'matrix @@ -1' is actually more characters than 'inv(matrix)'. But >>>>>>> sometimes transparent notation may be important. (And in some cases, >>>>>>> like using numba or theano or whatever, 'matrix @@ -1 @ foo' could be >>>>>>> compiled into a call to solve() anyway.) >>>>>>> >>>>>>> (Did I miss any?) >>>>>>> >>>>>>> In practice it seems to me that the last use case is the one that's >>>>>>> might matter a lot practice, but then again, it might not -- I'm not >>>>>>> sure. For example, does anyone who teaches programming with numpy have >>>>>>> a feeling about whether the existence of '@@ -1' would make a big >>>>>>> difference to you and your students? (Alan? I know you were worried >>>>>>> about losing the .I attribute on matrices if switching to ndarrays for >>>>>>> teaching -- given that ndarray will probably not get a .I attribute, >>>>>>> how much would the existence of @@ -1 affect you?) >>>>>>> >>>>>>> On a more technical level, Guido is worried about how @@'s precedence >>>>>>> should work (and this is somewhat related to the other thread about >>>>>>> @'s precedence and associativity, because he feels that if we end up >>>>>>> giving @ and * different precedence, then that makes it much less >>>>>>> clear what to do with @@, and reduces the strength of the */**/@/@@ >>>>>>> analogy). In particular, if we want to argue for @@ then we'll need to >>>>>>> figure out what expressions like >>>>>>> a @@ b @@ c >>>>>>> and >>>>>>> a ** b @@ c >>>>>>> and >>>>>>> a @@ b ** c >>>>>>> should do. >>>>>>> >>>>>>> A related question is what @@ should do if given an array as its right >>>>>>> argument. In the current PEP, only integers are accepted, which rules >>>>>>> out a bunch of the more complicated cases like a @@ b @@ c (at least >>>>>>> assuming @@ is right-associative, like **, and I can't see why you'd >>>>>>> want anything else). OTOH, in the brave new gufunc world, it >>>>>>> technically would make sense to define @@ as being a gufunc with >>>>>>> signature (m,m),()->(m,m), and the way gufuncs work this *would* allow >>>>>>> the "power" to be an array -- for example, we'd have: >>>>>>> >>>>>>> mat = randn(m, m) >>>>>>> pow = range(n) >>>>>>> result = gufunc_matrix_power(mat, pow) >>>>>>> assert result.shape == (n, m, m) >>>>>>> for i in xrange(n): >>>>>>> assert np.all(result[i, :, :] == mat ** i) >>>>>>> >>>>>>> In this case, a @@ b @@ c would at least be a meaningful expression to >>>>>>> write. OTOH it would be incredibly bizarre and useless, so probably >>>>>>> no-one would ever write it. >>>>>>> >>>>>>> As far as these technical issues go, my guess is that the correct rule >>>>>>> is that @@ should just have the same precedence and the same (right) >>>>>>> associativity as **, and in practice no-one will ever write stuff like >>>>>>> a @@ b @@ c. But if we want to argue for @@ we need to come to some >>>>>>> consensus or another here. >>>>>>> >>>>>>> It's also possible the answer is "ugh, these issues are too >>>>>>> complicated, we should defer this until later when we have more >>>>>>> experience with @ and gufuncs and stuff". After all, I doubt anyone >>>>>>> else will swoop in and steal @@ to mean something else! OTOH, if e.g. >>>>>>> there's a strong feeling that '@@ -1' will make a big difference in >>>>>>> pedagogical contexts, then putting that off for years might be a >>>>>>> mistake. >>>>>>> >>>>>>> >>>>>> I don't have a strong feeling either way on '@@' . Matrix inverses are >>>>>> pretty common in matrix expressions, but I don't know that the new operator >>>>>> offers much advantage over a function call. The positive integer powers >>>>>> might be useful in some domains, as others have pointed out, but >>>>>> computational practice one would tend to factor the evaluation. >>>>>> >>>>>> Chuck >>>>>> >>>>> Personally I think it should go in, because: >>>>> - it's useful (although marginally), as in the examples previously >>>>> mentioned >>>>> - it's what people will expect >>>>> - it's the only reasonable use of @@ once @ makes it in >>>>> >>>>> As far as the details about precedence rules and what not... Yes, >>>>> someone should think about them and come up with rules that make sense, but >>>>> since it will be pretty much only be used in unambiguous situations, this >>>>> shouldn't be a blocker. >>>>> >>>>> -=- Olivier >>>>> >>>>> _______________________________________________ >>>>> NumPy-Discussion mailing list >>>>> NumPy-Discussion at scipy.org >>>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>>>> >>>>> >>>> _______________________________________________ >>>> NumPy-Discussion mailing list >>>> NumPy-Discussion at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>>> >>>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >>> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > -------------- next part -------------- > An HTML attachment was scrubbed... > URL: http://mail.scipy.org/pipermail/numpy-discussion/attachments/20140315/ddc1812f/attachment.html > > ------------------------------ > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > End of NumPy-Discussion Digest, Vol 90, Issue 45 > ************************************************ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Sun Mar 16 12:42:58 2014 From: njs at pobox.com (Nathaniel Smith) Date: Sun, 16 Mar 2014 16:42:58 +0000 Subject: [Numpy-discussion] It looks like Py 3.5 will include a dedicated infix matrix multiply operator In-Reply-To: References: Message-ID: On Sun, Mar 16, 2014 at 4:33 PM, Eelco Hoogendoorn wrote: >> >> Different people work on different code and have different experiences >> here -- yours may or may be typical yours. Pauli did some quick checks >> on scikit-learn & nipy & scipy, and found that in their test suites, >> uses of np.dot and uses of elementwise-multiplication are ~equally >> common: https://github.com/numpy/numpy/pull/4351#issuecomment-37717330h > > Yeah; these are examples of linalg-heavy packages. Even there, dot does not > dominate. Not sure what makes them "linalg-heavy" -- they're just trying to cover two application areas, machine learning and neuroscience. If that turns out to involve a lot of linear algebra, well, then... > 780 calls is not tons of use, and these projects are outliers id argue. But you haven't argued! You've just asserted. I admittedly didn't spend a lot of time figuring out what the "most representative" projects were, I just picked two high profile ones off the top of my head, but I ran the numbers and they came out the way they did. (I wasn't convinced @ was useful either when I started, I just figured it would be good to settle the infix operator question one way or the other. I was also surprised np.dot turned out to be used that heavily.) If you don't like my data, then show us yours :-). -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org From hoogendoorn.eelco at gmail.com Sun Mar 16 12:51:04 2014 From: hoogendoorn.eelco at gmail.com (Eelco Hoogendoorn) Date: Sun, 16 Mar 2014 17:51:04 +0100 Subject: [Numpy-discussion] It looks like Py 3.5 will include a dedicated infix matrix multiply operator In-Reply-To: References: Message-ID: > An important distinction between calling dot or @ is that matrix multiplication is a domain where enormous effort has already been spent on > algorithms and building fast, scalable libraries. Yes einsum can call these for some subset of calls but it's also trivial to set up a case where it can't. > This is a huge pitfall because it hides this complexity. > Einsum, despite the brevity that it can provide, is too general to make a basic building block. There isn't a good way to reason about its runtime. I am not arguing in favor of einsum; I am arguing in favor of being explicit, rather than hiding semantically meaningful information from the code. Whether using @ or dot or einsum, you are not explicitly specifying the type of algorithm used, so on that front, its a wash, really. But at least dot and einsum have room for keyword arguments. '@' is in my perception simply too narrow an interface to cram in all meaningful information that you might want to specify concerning a linear product. > Matrix-matrix and matrix-vector products are the fundamental operations, generalized multilinear products etc are not. Perhaps from a library perspective, but from a conceptual perspective, it is very much the other way around. If we keep going in the direction that numba/theano/loopy take, such library functionality will soon be moot. Id argue that the priority of the default semantics should be in providing a unified conceptual scheme, rather than maximum performance considerations. Ideally, the standard operator would pick a sensible default which can be inferred from the arguments, while allowing for explicit specification of the kind of algorithm used where this verbosity is worth the hassle. On Sun, Mar 16, 2014 at 5:33 PM, Eelco Hoogendoorn < hoogendoorn.eelco at gmail.com> wrote: > > >> Different people work on different code and have different experiences >> here -- yours may or may be typical yours. Pauli did some quick checks >> on scikit-learn & nipy & scipy, and found that in their test suites, >> uses of np.dot and uses of elementwise-multiplication are ~equally >> common: https://github.com/numpy/numpy/pull/4351#issuecomment-37717330h >> > > Yeah; these are examples of linalg-heavy packages. Even there, dot does > not dominate. > > > >> My impression from the other thread is that @@ probably won't end up >> existing, so you're safe here ;-). >> > I know; my point is that the same objections apply to @, albeit in weaker > form. > > > >> Einstein notation is coming up on its 100th birthday and is just as >> blackboard-friendly as matrix product notation. Yet there's still a >> huge number of domains where the matrix notation dominates. It's cool >> if you aren't one of the people who find it useful, but I don't think >> it's going anywhere soon. >> > Einstein notation is just as blackboard friendly; but also much more > computer-future proof. I am not saying matrix multiplication is going > anywhere soon; but as far as I can tell that is all inertia; historical > circumstance has not accidentially prepared it well for numerical needs, as > far as I can tell. > > > The analysis in the PEP found ~780 calls to np.dot, just in the two > projects I happened to look at. @ will get tons of use in the real > world. Maybe all those people who will be using it would be happier if > they were using einsum instead, I dunno, but it's an argument you'll > have to convince them of, not me :-). > > 780 calls is not tons of use, and these projects are outliers id argue. > > > I just read for the first time two journal articles in econometrics that > use einsum notation. > > I have no idea what their formulas are supposed to mean, no sum signs > and no matrix algebra. > > If they could have been expressed more clearly otherwise, of course this > is what they should have done; but could they? b_i = A_ij x_j isnt exactly > hard to read, but if it was some form of complicated product, its probably > tensor notation was their best bet. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Sun Mar 16 13:10:21 2014 From: njs at pobox.com (Nathaniel Smith) Date: Sun, 16 Mar 2014 17:10:21 +0000 Subject: [Numpy-discussion] NumPy-Discussion Digest, Vol 90, Issue 45 In-Reply-To: <5325D33A.7050501@gmail.com> References: <5325D33A.7050501@gmail.com> Message-ID: On Sun, Mar 16, 2014 at 4:37 PM, Colin J. Williams wrote: > I would like to see the case made for @. Yes, I know that Guido has > accepted the idea, but he has changed his mind before. I'm not sure how to usefully respond to this, since, I already wrote a ~20 page document making the case for @? Maybe if you think the arguments in it aren't good, it would be more helpful to explain which ones and why? > The PEP seems neutral to retaining both np.matrix and @. I'm not sure what gives you this impression. The main point of the whole first section of the PEP is to explain why the existence of np.matrix causes problems and why a substantial majority of developers hate it, and how adding @ will let us solve these problems. Whether we actually get rid of np.matrix is a more complicated question (we'll need sort of compatibility/transition strategy, it will depend on how quickly python versions with @ support are adopted, etc.), but at the very least the goal is that @ eventually replace it in all new code. -n -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org From joseph.martinot-lagarde at m4x.org Sun Mar 16 15:35:52 2014 From: joseph.martinot-lagarde at m4x.org (Joseph Martinot-Lagarde) Date: Sun, 16 Mar 2014 20:35:52 +0100 Subject: [Numpy-discussion] It looks like Py 3.5 will include a dedicated infix matrix multiply operator In-Reply-To: References: Message-ID: <5325FD18.6080402@m4x.org> Le 16/03/2014 15:39, Eelco Hoogendoorn a ?crit : > Note that I am not opposed to extra operators in python, and only mildly > opposed to a matrix multiplication operator in numpy; but let me lay out > the case against, for your consideration. > > First of all, the use of matrix semantics relative to arrays > semantics is extremely rare; even in linear algebra heavy code, arrays > semantics often dominate. As such, the default of array semantics for > numpy has been a great choice. Ive never looked back at MATLAB semantics. > > Secondly, I feel the urge to conform to a historical mathematical > notation is misguided, especially for the problem domain of linear > algebra. Perhaps in the world of mathematics your operation is > associative or commutes, but on your computer, the order of operations > will influence both outcomes and performance. Even for products, we > usually care not only about the outcome, but also how that outcome is > arrived at. And along the same lines, I don't suppose I need to explain > how I feel about A@@-1 and the likes. Sure, it isn't to hard to learn or > infer this implies a matrix inverse, but why on earth would I want to > pretend the rich complexity of numerical matrix inversion can be mangled > into one symbol? Id much rather write inv or pinv, or whatever > particular algorithm happens to be called for given the situation. > Considering this isn't the num-lisp discussion group, I suppose I am > hardly the only one who feels so. > > On the whole, I feel the @ operator is mostly superfluous. I prefer to > be explicit about where I place my brackets. I prefer to be explicit > about the data layout and axes that go into a (multi)linear product, > rather than rely on obtuse row/column conventions which are not > transparent across function calls. When I do linear algebra, it is > almost always vectorized over additional axes; how does a special > operator which is only well defined for a few special cases of 2d and 1d > tensors help me with that? Well, the PEP explains a well-defined logical interpretation for cases >2d, using broadcasting. You can vectorize over additionnal axes. > On the whole, the linear algebra conventions > inspired by the particular constraints of people working > with blackboards, are a rather ugly and hacky beast in my opinion, which > I feel no inclination to emulate. As a sidenote to the contrary; I love > using broadcasting semantics when writing papers. Sure, your reviewers > will balk at it, but it wouldn't do to give the dinosaurs the last word > on what any given formal language ought to be like. We get to define the > future, and im not sure the set of conventions that goes under the name > of 'matrix multiplication' is one of particular importance to the future > of numerical linear algebra. > > Note that I don't think there is much harm in an @ operator; but I don't > see myself using it either. Aside from making textbook examples like a > gram-schmidt orthogonalization more compact to write, I don't see it > having much of an impact in the real world. > > > On Sat, Mar 15, 2014 at 3:52 PM, Charles R Harris > > wrote: > > > > > On Fri, Mar 14, 2014 at 6:51 PM, Nathaniel Smith > wrote: > > Well, that was fast. Guido says he'll accept the addition of '@' > as an > infix operator for matrix multiplication, once some details are > ironed > out: > https://mail.python.org/pipermail/python-ideas/2014-March/027109.html > http://legacy.python.org/dev/peps/pep-0465/ > > Specifically, we need to figure out whether we want to make an > argument for a matrix power operator ("@@"), and what > precedence/associativity we want '@' to have. I'll post two separate > threads to get feedback on those in an organized way -- this is > just a > heads-up. > > > Surprisingly little discussion on python-ideas, or so it seemed to > me. Guido came out in favor less than halfway through. > Congratulations on putting together a successful proposal, many of > us had given up on ever seeing a matrix multiplication operator. > > Chuck > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > --- Ce courrier ?lectronique ne contient aucun virus ou logiciel malveillant parce que la protection avast! Antivirus est active. http://www.avast.com From ralf.gommers at gmail.com Sun Mar 16 16:57:41 2014 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sun, 16 Mar 2014 21:57:41 +0100 Subject: [Numpy-discussion] ANN: Scipy 0.14.0 beta 1 release Message-ID: Hi, I'm pleased to announce the availability of the first beta release of Scipy0.14.0. Please try this beta and report any issues on the scipy-dev mailing list. Source tarballs, binaries and the full release notes can be found at http://sourceforge.net/projects/scipy/files/scipy/0.14.0b1/. Part of the release notes copied below. A big thank you to everyone who contributed to this release! Ralf SciPy 0.14.0 is the culmination of 8 months of hard work. It contains many new features, numerous bug-fixes, improved test coverage and better documentation. There have been a number of deprecations and API changes in this release, which are documented below. All users are encouraged to upgrade to this release, as there are a large number of bug-fixes and optimizations. Moreover, our development attention will now shift to bug-fix releases on the 0.14.x branch, and on adding new features on the master branch. This release requires Python 2.6, 2.7 or 3.2-3.4 and NumPy 1.5.1 or greater. New features ============ ``scipy.interpolate`` improvements ---------------------------------- A new wrapper function `scipy.interpolate.interpn` for interpolation on regular grids has been added. `interpn` supports linear and nearest-neighbor interpolation in arbitrary dimensions and spline interpolation in two dimensions. Faster implementations of piecewise polynomials in power and Bernstein polynomial bases have been added as `scipy.interpolate.PPoly` and `scipy.interpolate.BPoly`. New users should use these in favor of `scipy.interpolate.PiecewisePolynomial`. `scipy.interpolate.interp1d` now accepts non-monotonic inputs and sorts them. If performance is critical, sorting can be turned off by using the new ``assume_sorted`` keyword. Functionality for evaluation of bivariate spline derivatives in ``scipy.interpolate`` has been added. The new class `scipy.interpolate.Akima1DInterpolator` implements the piecewise cubic polynomial interpolation scheme devised by H. Akima. Functionality for fast interpolation on regular, unevenly spaced grids in arbitrary dimensions has been added as `scipy.interpolate.RegularGridInterpolator` . ``scipy.linalg`` improvements ----------------------------- The new function `scipy.linalg.dft` computes the matrix of the discrete Fourier transform. A condition number estimation function for matrix exponential, `scipy.linalg.expm_cond`, has been added. ``scipy.optimize`` improvements ------------------------------- A set of benchmarks for optimize, which can be run with ``optimize.bench()``, has been added. `scipy.optimize.curve_fit` now has more controllable error estimation via the ``absolute_sigma`` keyword. Support for passing custom minimization methods to ``optimize.minimize()`` and ``optimize.minimize_scalar()`` has been added, currently useful especially for combining ``optimize.basinhopping()`` with custom local optimizer routines. ``scipy.stats`` improvements ---------------------------- A new class `scipy.stats.multivariate_normal` with functionality for multivariate normal random variables has been added. A lot of work on the ``scipy.stats`` distribution framework has been done. Moment calculations (skew and kurtosis mainly) are fixed and verified, all examples are now runnable, and many small accuracy and performance improvements for individual distributions were merged. The new function `scipy.stats.anderson_ksamp` computes the k-sample Anderson-Darling test for the null hypothesis that k samples come from the same parent population. ``scipy.signal`` improvements ----------------------------- ``scipy.signal.iirfilter`` and related functions to design Butterworth, Chebyshev, elliptical and Bessel IIR filters now all use pole-zero ("zpk") format internally instead of using transformations to numerator/denominator format. The accuracy of the produced filters, especially high-order ones, is improved significantly as a result. The new function `scipy.signal.vectorstrength` computes the vector strength, a measure of phase synchrony, of a set of events. ``scipy.special`` improvements ------------------------------ The functions `scipy.special.boxcox` and `scipy.special.boxcox1p`, which compute the Box-Cox transformation, have been added. ``scipy.sparse`` improvements ----------------------------- - Significant performance improvement in CSR, CSC, and DOK indexing speed. - When using Numpy >= 1.9 (to be released in MM 2014), sparse matrices function correctly when given to arguments of ``np.dot``, ``np.multiply`` and other ufuncs. With earlier Numpy and Scipy versions, the results of such operations are undefined and usually unexpected. - Sparse matrices are no longer limited to ``2^31`` nonzero elements. They automatically switch to using 64-bit index data type for matrices containing more elements. User code written assuming the sparse matrices use int32 as the index data type will continue to work, except for such large matrices. Code dealing with larger matrices needs to accept either int32 or int64 indices. Deprecated features =================== ``anneal`` ---------- The global minimization function `scipy.optimize.anneal` is deprecated. All users should use the `scipy.optimize.basinhopping` function instead. ``scipy.stats`` --------------- ``randwcdf`` and ``randwppf`` functions are deprecated. All users should use distribution-specific ``rvs`` methods instead. Probability calculation aliases ``zprob``, ``fprob`` and ``ksprob`` are deprecated. Use instead the ``sf`` methods of the corresponding distributions or the ``special`` functions directly. ``scipy.interpolate`` --------------------- ``PiecewisePolynomial`` class is deprecated. Backwards incompatible changes ============================== scipy.special.lpmn ------------------ ``lpmn`` no longer accepts complex-valued arguments. A new function ``clpmn`` with uniform complex analytic behavior has been added, and it should be used instead. scipy.sparse.linalg ------------------- Eigenvectors in the case of generalized eigenvalue problem are normalized to unit vectors in 2-norm, rather than following the LAPACK normalization convention. The deprecated UMFPACK wrapper in ``scipy.sparse.linalg`` has been removed due to license and install issues. If available, ``scikits.umfpack`` is still used transparently in the ``spsolve`` and ``factorized`` functions. Otherwise, SuperLU is used instead in these functions. scipy.stats ----------- The deprecated functions ``glm``, ``oneway`` and ``cmedian`` have been removed from ``scipy.stats``. ``stats.scoreatpercentile`` now returns an array instead of a list of percentiles. scipy.interpolate ----------------- The API for computing derivatives of a monotone piecewise interpolation has changed: if `p` is a ``PchipInterpolator`` object, `p.derivative(der)` returns a callable object representing the derivative of `p`. For in-place derivatives use the second argument of the `__call__` method: `p(0.1, der=2)` evaluates the second derivative of `p` at `x=0.1`. The method `p.derivatives` has been removed. Authors ======= * Marc Abramowitz + * andbo + * Vincent Arel-Bundock + * Petr Baudis + * Max Bolingbroke * Fran?ois Boulogne * Matthew Brett * Lars Buitinck * Evgeni Burovski * CJ Carey + * Thomas A Caswell + * Pawel Chojnacki + * Phillip Cloud + * Stefano Costa + * David Cournapeau * Dapid + * Matthieu Dartiailh + * Christoph Deil + * J?rg Dietrich + * endolith * Francisco de la Pe?a + * Ben FrantzDale + * Jim Garrison + * Andr? Gaul * Christoph Gohlke * Ralf Gommers * Robert David Grant * Alex Griffing * Blake Griffith * Yaroslav Halchenko * Andreas Hilboll * Kat Huang * Gert-Ludwig Ingold * jamestwebber + * Dorota Jarecka + * Todd Jennings + * Thouis (Ray) Jones * Juan Luis Cano Rodr?guez * ktritz + * Jacques Kvam + * Eric Larson + * Justin Lavoie + * Denis Laxalde * Jussi Leinonen + * lemonlaug + * Tim Leslie * Alain Leufroy + * George Lewis + * Max Linke + * Brandon Liu + * Benny Malengier + * Matthias K?mmerer + * Cimarron Mittelsteadt + * Eric Moore * Andrew Nelson + * Niklas Hamb?chen + * Joel Nothman + * Clemens Novak * Emanuele Olivetti + * Stefan Otte + * peb + * Josef Perktold * pjwerneck * Andrew Sczesnak + * poolio * J?r?me Roy + * Carl Sandrock + * Shauna + * Fabrice Silva * Daniel B. Smith * Patrick Snape + * Thomas Spura + * Jacob Stevenson * Julian Taylor * Tomas Tomecek * Richard Tsai * Joris Vankerschaver + * Pauli Virtanen * Warren Weckesser A total of 78 people contributed to this release. People with a "+" by their names contributed a patch for the first time. This list of names is automatically generated, and may not be fully complete. -------------- next part -------------- An HTML attachment was scrubbed... URL: From sturla.molden at gmail.com Mon Mar 17 06:15:23 2014 From: sturla.molden at gmail.com (Sturla Molden) Date: Mon, 17 Mar 2014 10:15:23 +0000 (UTC) Subject: [Numpy-discussion] [RFC] should we argue for a matrix power operator, @@? References: Message-ID: <1960528013416744012.565625sturla.molden-gmail.com@news.gmane.org> Personally I did not like @@ in the first place. Sturla Nathaniel Smith wrote: > Hi all, > > Here's the second thread for discussion about Guido's concerns about > PEP 465. The issue here is that PEP 465 as currently written proposes > two new operators, @ for matrix multiplication and @@ for matrix power > (analogous to * and **): > http://legacy.python.org/dev/peps/pep-0465/ > > The main thing we care about of course is @; I pushed for including @@ > because I thought it was nicer to have than not, and I thought the > analogy between * and ** might make the overall package more appealing > to Guido's aesthetic sense. > > It turns out I was wrong :-). Guido is -0 on @@, but willing to be > swayed if we think it's worth the trouble to make a solid case. > > Note that question now is *not*, how will @@ affect the reception of > @. @ itself is AFAICT a done deal, regardless of what happens with @@. > For this discussion let's assume @ can be taken for granted, and that > we can freely choose to either add @@ or not add @@ to the language. > The question is: which do we think makes Python a better language (for > us and in general)? > > Some thoughts to start us off: > > Here are the interesting use cases for @@ that I can think of: > - 'vector @@ 2' gives the squared Euclidean length (because it's the > same as vector @ vector). Kind of handy. > - 'matrix @@ n' of course gives the matrix power, which is of marginal > use but does come in handy sometimes, e.g., when looking at graph > connectivity. > - 'matrix @@ -1' provides a very transparent notation for translating > textbook formulas (with all their inverses) into code. It's a bit > unhelpful in practice, because (a) usually you should use solve(), and > (b) 'matrix @@ -1' is actually more characters than 'inv(matrix)'. But > sometimes transparent notation may be important. (And in some cases, > like using numba or theano or whatever, 'matrix @@ -1 @ foo' could be > compiled into a call to solve() anyway.) > > (Did I miss any?) > > In practice it seems to me that the last use case is the one that's > might matter a lot practice, but then again, it might not -- I'm not > sure. For example, does anyone who teaches programming with numpy have > a feeling about whether the existence of '@@ -1' would make a big > difference to you and your students? (Alan? I know you were worried > about losing the .I attribute on matrices if switching to ndarrays for > teaching -- given that ndarray will probably not get a .I attribute, > how much would the existence of @@ -1 affect you?) > > On a more technical level, Guido is worried about how @@'s precedence > should work (and this is somewhat related to the other thread about > @'s precedence and associativity, because he feels that if we end up > giving @ and * different precedence, then that makes it much less > clear what to do with @@, and reduces the strength of the */**/@/@@ > analogy). In particular, if we want to argue for @@ then we'll need to > figure out what expressions like > a @@ b @@ c > and > a ** b @@ c > and > a @@ b ** c > should do. > > A related question is what @@ should do if given an array as its right > argument. In the current PEP, only integers are accepted, which rules > out a bunch of the more complicated cases like a @@ b @@ c (at least > assuming @@ is right-associative, like **, and I can't see why you'd > want anything else). OTOH, in the brave new gufunc world, it > technically would make sense to define @@ as being a gufunc with > signature (m,m),()->(m,m), and the way gufuncs work this *would* allow > the "power" to be an array -- for example, we'd have: > > mat = randn(m, m) > pow = range(n) > result = gufunc_matrix_power(mat, pow) > assert result.shape == (n, m, m) > for i in xrange(n): > assert np.all(result[i, :, :] == mat ** i) > > In this case, a @@ b @@ c would at least be a meaningful expression to > write. OTOH it would be incredibly bizarre and useless, so probably > no-one would ever write it. > > As far as these technical issues go, my guess is that the correct rule > is that @@ should just have the same precedence and the same (right) > associativity as **, and in practice no-one will ever write stuff like > a @@ b @@ c. But if we want to argue for @@ we need to come to some > consensus or another here. > > It's also possible the answer is "ugh, these issues are too > complicated, we should defer this until later when we have more > experience with @ and gufuncs and stuff". After all, I doubt anyone > else will swoop in and steal @@ to mean something else! OTOH, if e.g. > there's a strong feeling that '@@ -1' will make a big difference in > pedagogical contexts, then putting that off for years might be a > mistake. > > -n From njs at pobox.com Mon Mar 17 07:53:55 2014 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 17 Mar 2014 11:53:55 +0000 Subject: [Numpy-discussion] [RFC] should we argue for a matrix power operator, @@? In-Reply-To: References: Message-ID: On Sat, Mar 15, 2014 at 4:32 AM, Nathaniel Smith wrote: > For this discussion let's assume @ can be taken for granted, and that > we can freely choose to either add @@ or not add @@ to the language. > The question is: which do we think makes Python a better language (for > us and in general)? The thread so far, it sounds like the consensus answer is "meh, whatever". So I'm thinking we should just drop @@ from the PEP, and if it turns out that this is a problem we can always revisit it in the ~3.6/3.7 timeframe. -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org From robert.kern at gmail.com Mon Mar 17 07:55:05 2014 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 17 Mar 2014 11:55:05 +0000 Subject: [Numpy-discussion] [RFC] should we argue for a matrix power operator, @@? In-Reply-To: References: Message-ID: On Mon, Mar 17, 2014 at 11:53 AM, Nathaniel Smith wrote: > On Sat, Mar 15, 2014 at 4:32 AM, Nathaniel Smith wrote: >> For this discussion let's assume @ can be taken for granted, and that >> we can freely choose to either add @@ or not add @@ to the language. >> The question is: which do we think makes Python a better language (for >> us and in general)? > > The thread so far, it sounds like the consensus answer is "meh, > whatever". So I'm thinking we should just drop @@ from the PEP, and if > it turns out that this is a problem we can always revisit it in the > ~3.6/3.7 timeframe. +1. Thanks! -- Robert Kern From njs at pobox.com Mon Mar 17 11:48:28 2014 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 17 Mar 2014 15:48:28 +0000 Subject: [Numpy-discussion] [help needed] associativity and precedence of '@' In-Reply-To: References: Message-ID: On Sat, Mar 15, 2014 at 7:01 PM, Alexander Belopolsky wrote: > > On Sat, Mar 15, 2014 at 2:25 PM, Alexander Belopolsky > wrote: >> >> On Fri, Mar 14, 2014 at 11:41 PM, Nathaniel Smith wrote: >>> >>> Here's the main blocker for adding a matrix multiply operator '@' to >>> Python: we need to decide what we think its precedence and associativity >>> should be. >> >> >> I am not ready to form my own opinion, but I hope the following will help >> shaping the discussion. > > > One more question that I think should be answered by the PEP and may > influence the associativity decision is what happens if in an A @ B @ C > expression, each operand has its own type that defines __matmul__ and > __rmatmul__? For example, A can be an ndarray, B a sympy expression and C a > pyoperator. The general rule in Python is that in a binary operation A # B, then first we try A.__special__, and if that doesn't exist or it returns NotImplemented, then we try B.__rspecial__. (The exception is that if B.__class__ is a proper subclass of A.__class__, then we do it in the reverse order.) -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org From ndarray at mac.com Mon Mar 17 12:09:32 2014 From: ndarray at mac.com (Alexander Belopolsky) Date: Mon, 17 Mar 2014 12:09:32 -0400 Subject: [Numpy-discussion] [help needed] associativity and precedence of '@' In-Reply-To: References: Message-ID: On Mon, Mar 17, 2014 at 11:48 AM, Nathaniel Smith wrote: > > One more question that I think should be answered by the PEP and may > > influence the associativity decision is what happens if in an A @ B @ C > > expression, each operand has its own type that defines __matmul__ and > > __rmatmul__? For example, A can be an ndarray, B a sympy expression and > C a > > pyoperator. > > The general rule in Python is that in a binary operation A # B, then > first we try A.__special__, and if that doesn't exist or it returns > NotImplemented, then we try B.__rspecial__. (The exception is that if > B.__class__ is a proper subclass of A.__class__, then we do it in the > reverse order.) This is the simple case. My question was: "what happens if in an A @ B @ C expression, each operand has its own type that defines __matmul__ and __rmatmul__?" Are we going to recommend that other projects adopt numpy's __array_priority__? In mixed-type expressions, do you expect A @ B @ C to have type of A, B, or C? Does __matmul__ first then __rmatmul__ rule makes sense if @ becomes right-associative or should the order be reversed? -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Mon Mar 17 12:09:58 2014 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 17 Mar 2014 16:09:58 +0000 Subject: [Numpy-discussion] [help needed] associativity and precedence of '@' In-Reply-To: References: Message-ID: On Mon, Mar 17, 2014 at 3:48 PM, Nathaniel Smith wrote: > On Sat, Mar 15, 2014 at 7:01 PM, Alexander Belopolsky wrote: >> One more question that I think should be answered by the PEP and may >> influence the associativity decision is what happens if in an A @ B @ C >> expression, each operand has its own type that defines __matmul__ and >> __rmatmul__? For example, A can be an ndarray, B a sympy expression and C a >> pyoperator. > > The general rule in Python is that in a binary operation A # B, then > first we try A.__special__, and if that doesn't exist or it returns > NotImplemented, then we try B.__rspecial__. (The exception is that if > B.__class__ is a proper subclass of A.__class__, then we do it in the > reverse order.) Assuming that all combinations are possible and give no error: A @ B @ C == A.__matmul__(B.__matmul__(C)) # right A @ B @ C == A.__matmul__(B).__matmul__(C) # left Did you want to specify which permutations of X.__matmul__(Y) return NotImplemented? -- Robert Kern From njs at pobox.com Mon Mar 17 12:13:25 2014 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 17 Mar 2014 16:13:25 +0000 Subject: [Numpy-discussion] [help needed] associativity and precedence of '@' In-Reply-To: References: Message-ID: On Mon, Mar 17, 2014 at 4:09 PM, Alexander Belopolsky wrote: > > On Mon, Mar 17, 2014 at 11:48 AM, Nathaniel Smith wrote: >> >> > One more question that I think should be answered by the PEP and may >> > influence the associativity decision is what happens if in an A @ B @ C >> > expression, each operand has its own type that defines __matmul__ and >> > __rmatmul__? For example, A can be an ndarray, B a sympy expression and >> > C a >> > pyoperator. >> >> The general rule in Python is that in a binary operation A # B, then >> first we try A.__special__, and if that doesn't exist or it returns >> NotImplemented, then we try B.__rspecial__. (The exception is that if >> B.__class__ is a proper subclass of A.__class__, then we do it in the >> reverse order.) > > This is the simple case. My question was: "what happens if in an A @ B @ C > expression, each operand has its own type that defines __matmul__ and > __rmatmul__?" The point of associativity is that the complex case A @ B @ C gets turned into either A @ (B @ C) or else (A @ B) @ C, and then you're back in the simple case. > Are we going to recommend that other projects adopt numpy's > __array_priority__? > > In mixed-type expressions, do you expect A @ B @ C to have type of A, B, or > C? > > Does __matmul__ first then __rmatmul__ rule makes sense if @ becomes > right-associative or should the order be reversed? ** is right-associative and uses the left-then-right rule, so it seems fine to me. In general the left-then-right rule has no particular logic behind it, it's just chosen so as to have *some* rule. In practice all well-behaved classes have to make sure that they implement __special__ methods in such a way that all the different variations work, no matter which class ends up actually handling the operation. -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org From ndarray at mac.com Mon Mar 17 12:50:05 2014 From: ndarray at mac.com (Alexander Belopolsky) Date: Mon, 17 Mar 2014 12:50:05 -0400 Subject: [Numpy-discussion] [help needed] associativity and precedence of '@' In-Reply-To: References: Message-ID: On Mon, Mar 17, 2014 at 12:13 PM, Nathaniel Smith wrote: > In practice all > well-behaved classes have to make sure that they implement __special__ > methods in such a way that all the different variations work, no > matter which class ends up actually handling the operation. > "Well-behaved classes" are hard to come by in practice. The @ operator may fix the situation with np.matrix, so take a look at MaskedArray with its 40-line __array_wrap__ and no end of bugs. Requiring superclass __method__ to handle creation of subclass results correctly is turning Liskov principle on its head. With enough clever tricks and tight control over the full class hierarchy you can make it work in some cases, but it is not a good design. I am afraid that making @ special among other binary operators that implement mathematically associative operations will create a lot of confusion. (The pow operator is special because the corresponding mathematical operation is non-associative.) Imagine teaching someone that a % b % c = (a % b) % c, but a @ b @ c = a @ (b @ c). What are the chances that they will correctly figure out what a // b // c means after this? -------------- next part -------------- An HTML attachment was scrubbed... URL: From aron at ahmadia.net Mon Mar 17 13:01:46 2014 From: aron at ahmadia.net (Aron Ahmadia) Date: Mon, 17 Mar 2014 13:01:46 -0400 Subject: [Numpy-discussion] [RFC] should we argue for a matrix power operator, @@? In-Reply-To: References: Message-ID: On Mon, Mar 17, 2014 at 7:53 AM, Nathaniel Smith wrote: > The thread so far, it sounds like the consensus answer is "meh, > whatever". So I'm thinking we should just drop @@ from the PEP, and if > it turns out that this is a problem we can always revisit it in the > ~3.6/3.7 timeframe. > +1 from here. -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Mon Mar 17 13:18:35 2014 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 17 Mar 2014 13:18:35 -0400 Subject: [Numpy-discussion] [help needed] associativity and precedence of '@' In-Reply-To: References: Message-ID: On Mon, Mar 17, 2014 at 12:50 PM, Alexander Belopolsky wrote: > > On Mon, Mar 17, 2014 at 12:13 PM, Nathaniel Smith wrote: > >> In practice all >> well-behaved classes have to make sure that they implement __special__ >> methods in such a way that all the different variations work, no >> matter which class ends up actually handling the operation. >> > > "Well-behaved classes" are hard to come by in practice. The @ operator > may fix the situation with np.matrix, so take a look at MaskedArray with > its 40-line __array_wrap__ and no end of bugs. > > Requiring superclass __method__ to handle creation of subclass results > correctly is turning Liskov principle on its head. With enough clever > tricks and tight control over the full class hierarchy you can make it work > in some cases, but it is not a good design. > > I am afraid that making @ special among other binary operators that > implement mathematically associative operations will create a lot of > confusion. (The pow operator is special because the corresponding > mathematical operation is non-associative.) > > Imagine teaching someone that a % b % c = (a % b) % c, but a @ b @ c = a @ > (b @ c). What are the chances that they will correctly figure out what a > // b // c means after this? > One case where we need to keep track of left or right is type promotion >>> a.shape (100,) >>> 1. * a.dot(a) -98.0 >>> (1.*a).dot(a) 328350.0 >>> a.dtype dtype('int8') >>> 1. * a @ a ??? similar to >>> 1. * 2 / 3 0.6666666666666666 >>> 1. * (2 / 3) # I'm not in the `future` 0.0 Josef > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > >>> 1. * a.dot(a) -98.0 -------------- next part -------------- An HTML attachment was scrubbed... URL: From fperez.net at gmail.com Mon Mar 17 13:30:49 2014 From: fperez.net at gmail.com (Fernando Perez) Date: Mon, 17 Mar 2014 10:30:49 -0700 Subject: [Numpy-discussion] [RFC] should we argue for a matrix power operator, @@? In-Reply-To: References: Message-ID: On Mon, Mar 17, 2014 at 10:01 AM, Aron Ahmadia wrote: > > On Mon, Mar 17, 2014 at 7:53 AM, Nathaniel Smith wrote: > >> The thread so far, it sounds like the consensus answer is "meh, >> whatever". So I'm thinking we should just drop @@ from the PEP, and if >> it turns out that this is a problem we can always revisit it in the >> ~3.6/3.7 timeframe. >> > > +1 from here. > +1 too. Absent *clear* enthusiasm and support for new syntax/operators, I think being conservative and slow is the right approach. Just having @ will give us data and experience with this space, and it may become clear after one more cycle that we really need/want @@, or not, as the case may be. But it's easier to add it later if we really need it than to remove it if it proves to be a bad idea, so +1 for moving slowly on this. -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Mon Mar 17 14:55:21 2014 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 17 Mar 2014 14:55:21 -0400 Subject: [Numpy-discussion] [help needed] associativity and precedence of '@' In-Reply-To: References: Message-ID: On Mon, Mar 17, 2014 at 1:18 PM, wrote: > > > > On Mon, Mar 17, 2014 at 12:50 PM, Alexander Belopolsky wrote: > >> >> On Mon, Mar 17, 2014 at 12:13 PM, Nathaniel Smith wrote: >> >>> In practice all >>> well-behaved classes have to make sure that they implement __special__ >>> methods in such a way that all the different variations work, no >>> matter which class ends up actually handling the operation. >>> >> >> "Well-behaved classes" are hard to come by in practice. The @ operator >> may fix the situation with np.matrix, so take a look at MaskedArray with >> its 40-line __array_wrap__ and no end of bugs. >> >> Requiring superclass __method__ to handle creation of subclass results >> correctly is turning Liskov principle on its head. With enough clever >> tricks and tight control over the full class hierarchy you can make it work >> in some cases, but it is not a good design. >> >> I am afraid that making @ special among other binary operators that >> implement mathematically associative operations will create a lot of >> confusion. (The pow operator is special because the corresponding >> mathematical operation is non-associative.) >> >> Imagine teaching someone that a % b % c = (a % b) % c, but a @ b @ c = a >> @ (b @ c). What are the chances that they will correctly figure out what a >> // b // c means after this? >> > > One case where we need to keep track of left or right is type promotion > > >>> a.shape > (100,) > >>> 1. * a.dot(a) > -98.0 > >>> (1.*a).dot(a) > 328350.0 > >>> a.dtype > dtype('int8') > > >>> 1. * a @ a > ??? > > similar to > >>> 1. * 2 / 3 > 0.6666666666666666 > >>> 1. * (2 / 3) # I'm not in the `future` > 0.0 > I thought of sending a message with I'm +-1 on either, but I'm not I'm again in favor of "left", because it's the simplest to understand A.dot(B).dot(C) with some * mixed in I understand now the computational argument in favor of right x @ inv(x.T @ x) @ x.T @ y ( with shapes T,k k,k k,T T,1 ) or x @ pinv(x) @ y (with shapes T,k k,T T,1 ) with with T>>k (last 1 could be a m>1 with T>>m) However, we don't write code like that most of the time. Alan's students won't care much if some intermediate arrays blow up. In library code like in statsmodels it's almost always a conscious choice of where to set the parenthesis and, more often, which part of a long array expression is taken out as a temporary or permanent variable. I think almost the only uses of chain_dot(A, B, C) (which is "right") is for quadratic forms xtxi = pinv(np.dot(exog.T, exog)) # k,k xtdx = np.dot(exog.T * d[np.newaxis, :], exog) # k,k vcov = chain_dot(xtxi, xtdx, xtxi) # kk, kk, kk (from Quantreg) I think optimizing this way is relatively easy On the other hand, I worry a lot more about messy cases with different dtypes or different classes involved as Alexander has pointed out. Cases that might trip up medium to medium-advanced numpy users. (Let's see, I have to read @ back to front, and * front to back, and why did I put a sparse matrix in the middle and a masked array at the end. Oh no, that's not a masked array it's a panda.) compared to (Somewhere there is a mistake, let's go through all terms from the beginning to the end) Josef > > Josef > > > >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> >>> 1. * a.dot(a) > -98.0 > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ndarray at mac.com Mon Mar 17 15:31:12 2014 From: ndarray at mac.com (Alexander Belopolsky) Date: Mon, 17 Mar 2014 15:31:12 -0400 Subject: [Numpy-discussion] [help needed] associativity and precedence of '@' In-Reply-To: References: Message-ID: On Mon, Mar 17, 2014 at 2:55 PM, wrote: > I'm again in favor of "left", because it's the simplest to understand > A.dot(B).dot(C) > +1 Note that for many years to come the best option for repeated matrix product will be A.dot(B).dot(C) ... People who convert their dot(dot(dot('s to more readable method call syntax now should not be forced to change the order or add parentheses when they switch to @. (Full disclosure: I am one of those people having recently converted a large Numeric-based project to NumPy.) -------------- next part -------------- An HTML attachment was scrubbed... URL: From rowen at uw.edu Mon Mar 17 16:37:35 2014 From: rowen at uw.edu (Russell E. Owen) Date: Mon, 17 Mar 2014 13:37:35 -0700 Subject: [Numpy-discussion] [help needed] associativity and precedence of '@' References: Message-ID: In article , Nathaniel Smith wrote: > OPTION 1 FOR @: > Precedence: same as * > Associativity: left > My shorthand name for it: "same-left" (yes, very creative) > > This means that if you don't use parentheses, you get: > a @ b @ c -> (a @ b) @ c > a * b @ c -> (a * b) @ c > a @ b * c -> (a @ b) * c > > OPTION 2 FOR @: > Precedence: more-weakly-binding than * > Associativity: right > My shorthand name for it: "weak-right" > > This means that if you don't use parentheses, you get: > a @ b @ c -> a @ (b @ c) > a * b @ c -> (a * b) @ c > a @ b * c -> a @ (b * c) > > OPTION 3 FOR @: > Precedence: more-tightly-binding than * > Associativity: right > My shorthand name for it: "tight-right" > > This means that if you don't use parentheses, you get: > a @ b @ c -> a @ (b @ c) > a * b @ c -> a * (b @ c) > a @ b * c -> (a @ b) * c > > We need to pick which of which options we think is best, based on whatever > reasons we can think of, ideally more than "hmm, weak-right gives me warm > fuzzy feelings" ;-). (In principle the other 2 possible options are > tight-left and weak-left, but there doesn't seem to be any argument in > favor of either, so we'll leave them out of the discussion.) After seeing all the traffic on this thread, I am in favor of "same-left" because it is easiest to remember: - It introduces no new rules. - It is unambiguous. If we pick option 2 or 3 we have no strong reason to favor one over the other, leaving users to guess. To my mind, being able to easily reason about code you are reading is more important that hoping to increase efficiency for one common case when not using parenthesis. It also has the advantage that it needs the least justification. -- Russell From projetmbc at gmail.com Mon Mar 17 17:32:08 2014 From: projetmbc at gmail.com (Christophe Bal) Date: Mon, 17 Mar 2014 22:32:08 +0100 Subject: [Numpy-discussion] [help needed] associativity and precedence of '@' In-Reply-To: References: Message-ID: Hello, and what about something like that ? a @ b @ c -> (a @ b) @ c a * b @ c -> (a * b) @ c a @ b * c -> a @ (b * c) Easy to remember. The *-product has priority to @-product, and then we just to @-product from left to right. An advantage of this is that parsers do job from left to right so I realy think that is a better choice than the weak-right. Christophe BAL 2014-03-17 21:37 GMT+01:00 Russell E. Owen : > In article > , > Nathaniel Smith wrote: > > > OPTION 1 FOR @: > > Precedence: same as * > > Associativity: left > > My shorthand name for it: "same-left" (yes, very creative) > > > > This means that if you don't use parentheses, you get: > > a @ b @ c -> (a @ b) @ c > > a * b @ c -> (a * b) @ c > > a @ b * c -> (a @ b) * c > > > > OPTION 2 FOR @: > > Precedence: more-weakly-binding than * > > Associativity: right > > My shorthand name for it: "weak-right" > > > > This means that if you don't use parentheses, you get: > > a @ b @ c -> a @ (b @ c) > > a * b @ c -> (a * b) @ c > > a @ b * c -> a @ (b * c) > > > > OPTION 3 FOR @: > > Precedence: more-tightly-binding than * > > Associativity: right > > My shorthand name for it: "tight-right" > > > > This means that if you don't use parentheses, you get: > > a @ b @ c -> a @ (b @ c) > > a * b @ c -> a * (b @ c) > > a @ b * c -> (a @ b) * c > > > > We need to pick which of which options we think is best, based on > whatever > > reasons we can think of, ideally more than "hmm, weak-right gives me warm > > fuzzy feelings" ;-). (In principle the other 2 possible options are > > tight-left and weak-left, but there doesn't seem to be any argument in > > favor of either, so we'll leave them out of the discussion.) > > After seeing all the traffic on this thread, I am in favor of > "same-left" because it is easiest to remember: > - It introduces no new rules. > - It is unambiguous. If we pick option 2 or 3 we have no strong reason > to favor one over the other, leaving users to guess. > > To my mind, being able to easily reason about code you are reading is > more important that hoping to increase efficiency for one common case > when not using parenthesis. > > It also has the advantage that it needs the least justification. > > -- Russell > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From projetmbc at gmail.com Mon Mar 17 17:34:49 2014 From: projetmbc at gmail.com (Christophe Bal) Date: Mon, 17 Mar 2014 22:34:49 +0100 Subject: [Numpy-discussion] [help needed] associativity and precedence of '@' In-Reply-To: References: Message-ID: Sorry for all the misspellings... 2014-03-17 22:32 GMT+01:00 Christophe Bal : > Hello, > and what about something like that ? > > a @ b @ c -> (a @ b) @ c > a * b @ c -> (a * b) @ c > a @ b * c -> a @ (b * c) > > Easy to remember. The *-product has priority to @-product, and then we > just to @-product from left to right. > > An advantage of this is that parsers do job from left to right so I realy > think that is a better choice than the weak-right. > > Christophe BAL > > > > 2014-03-17 21:37 GMT+01:00 Russell E. Owen : > > In article >> , >> Nathaniel Smith wrote: >> >> > OPTION 1 FOR @: >> > Precedence: same as * >> > Associativity: left >> > My shorthand name for it: "same-left" (yes, very creative) >> > >> > This means that if you don't use parentheses, you get: >> > a @ b @ c -> (a @ b) @ c >> > a * b @ c -> (a * b) @ c >> > a @ b * c -> (a @ b) * c >> > >> > OPTION 2 FOR @: >> > Precedence: more-weakly-binding than * >> > Associativity: right >> > My shorthand name for it: "weak-right" >> > >> > This means that if you don't use parentheses, you get: >> > a @ b @ c -> a @ (b @ c) >> > a * b @ c -> (a * b) @ c >> > a @ b * c -> a @ (b * c) >> > >> > OPTION 3 FOR @: >> > Precedence: more-tightly-binding than * >> > Associativity: right >> > My shorthand name for it: "tight-right" >> > >> > This means that if you don't use parentheses, you get: >> > a @ b @ c -> a @ (b @ c) >> > a * b @ c -> a * (b @ c) >> > a @ b * c -> (a @ b) * c >> > >> > We need to pick which of which options we think is best, based on >> whatever >> > reasons we can think of, ideally more than "hmm, weak-right gives me >> warm >> > fuzzy feelings" ;-). (In principle the other 2 possible options are >> > tight-left and weak-left, but there doesn't seem to be any argument in >> > favor of either, so we'll leave them out of the discussion.) >> >> After seeing all the traffic on this thread, I am in favor of >> "same-left" because it is easiest to remember: >> - It introduces no new rules. >> - It is unambiguous. If we pick option 2 or 3 we have no strong reason >> to favor one over the other, leaving users to guess. >> >> To my mind, being able to easily reason about code you are reading is >> more important that hoping to increase efficiency for one common case >> when not using parenthesis. >> >> It also has the advantage that it needs the least justification. >> >> -- Russell >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From projetmbc at gmail.com Mon Mar 17 17:38:42 2014 From: projetmbc at gmail.com (Christophe Bal) Date: Mon, 17 Mar 2014 22:38:42 +0100 Subject: [Numpy-discussion] [help needed] associativity and precedence of '@' In-Reply-To: References: Message-ID: Here is the translation. ;-) Hello, and what about something like that ? *a @ b @ c -> (a @ b) @ c* *a * b @ c -> (a * b) @ c* *a @ b * c -> a @ (b * c)* Easy to remember: the *-product has priority regarding to the @-product, and we just do @-product from left to right. An advantage of this is that most parsers do analyze from left to right. So I really think that it is a better choice than the weak-right one. Christophe BAL 2014-03-17 22:34 GMT+01:00 Christophe Bal : > Sorry for all the misspellings... > > > 2014-03-17 22:32 GMT+01:00 Christophe Bal : > > Hello, >> and what about something like that ? >> >> a @ b @ c -> (a @ b) @ c >> a * b @ c -> (a * b) @ c >> a @ b * c -> a @ (b * c) >> >> Easy to remember. The *-product has priority to @-product, and then we >> just to @-product from left to right. >> >> An advantage of this is that parsers do job from left to right so I realy >> think that is a better choice than the weak-right. >> >> Christophe BAL >> >> >> >> 2014-03-17 21:37 GMT+01:00 Russell E. Owen : >> >> In article >>> , >>> Nathaniel Smith wrote: >>> >>> > OPTION 1 FOR @: >>> > Precedence: same as * >>> > Associativity: left >>> > My shorthand name for it: "same-left" (yes, very creative) >>> > >>> > This means that if you don't use parentheses, you get: >>> > a @ b @ c -> (a @ b) @ c >>> > a * b @ c -> (a * b) @ c >>> > a @ b * c -> (a @ b) * c >>> > >>> > OPTION 2 FOR @: >>> > Precedence: more-weakly-binding than * >>> > Associativity: right >>> > My shorthand name for it: "weak-right" >>> > >>> > This means that if you don't use parentheses, you get: >>> > a @ b @ c -> a @ (b @ c) >>> > a * b @ c -> (a * b) @ c >>> > a @ b * c -> a @ (b * c) >>> > >>> > OPTION 3 FOR @: >>> > Precedence: more-tightly-binding than * >>> > Associativity: right >>> > My shorthand name for it: "tight-right" >>> > >>> > This means that if you don't use parentheses, you get: >>> > a @ b @ c -> a @ (b @ c) >>> > a * b @ c -> a * (b @ c) >>> > a @ b * c -> (a @ b) * c >>> > >>> > We need to pick which of which options we think is best, based on >>> whatever >>> > reasons we can think of, ideally more than "hmm, weak-right gives me >>> warm >>> > fuzzy feelings" ;-). (In principle the other 2 possible options are >>> > tight-left and weak-left, but there doesn't seem to be any argument in >>> > favor of either, so we'll leave them out of the discussion.) >>> >>> After seeing all the traffic on this thread, I am in favor of >>> "same-left" because it is easiest to remember: >>> - It introduces no new rules. >>> - It is unambiguous. If we pick option 2 or 3 we have no strong reason >>> to favor one over the other, leaving users to guess. >>> >>> To my mind, being able to easily reason about code you are reading is >>> more important that hoping to increase efficiency for one common case >>> when not using parenthesis. >>> >>> It also has the advantage that it needs the least justification. >>> >>> -- Russell >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Mon Mar 17 18:02:33 2014 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 17 Mar 2014 22:02:33 +0000 Subject: [Numpy-discussion] [help needed] associativity and precedence of '@' In-Reply-To: References: Message-ID: On Mon, Mar 17, 2014 at 9:38 PM, Christophe Bal wrote: > Here is the translation. ;-) > > Hello, > and what about something like that ? > > a @ b @ c -> (a @ b) @ c > a * b @ c -> (a * b) @ c > a @ b * c -> a @ (b * c) > > Easy to remember: the *-product has priority regarding to the @-product, and > we just do @-product from left to right. In the terminology we've been using in this thread, this is "weak-left". > An advantage of this is that most parsers do analyze from left to right. > > So I really think that it is a better choice than the weak-right one. We've mostly ignored this option because of assuming that if we want left-associativity, we should go with "same-left" instead of "weak-left". Same-left is: a @ b @ c -> (a @ b) @ c a * b @ c -> (a * b) @ c a @ b * c -> (a @ b) * c i.e., even more left-to-right than weak-left :-) Do you think weak-left is better than same-left? -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org From jtaylor.debian at googlemail.com Mon Mar 17 18:17:02 2014 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Mon, 17 Mar 2014 23:17:02 +0100 Subject: [Numpy-discussion] GSoC project: draft of proposal In-Reply-To: References: Message-ID: <5327745E.6080405@googlemail.com> On 12.03.2014 17:52, Leo Mao wrote: > Hi, > The attachment is my draft of proposal. The project is "vector math > library integration". > I think I need some feedback to make it solider. > Any comment will be appreciated. > Thanks in advance. > hi, I finally had some time too properly look at your proposal, here are my comments. First of all I hope you are aware this is a very challenging project as you will have to deal with issues of several different areas: build systems, portability, low level performance, numerical issues, testing and the in some places quite daunting numpy codebase. I do fear that it might be too much for a first year student. Your proposal is lacking some information on your experiences. Are you already familiar with vectorization via SIMD? While the goal of this project is partly to avoid writing more vector code in NumPy it is still very useful if you are familiar with how it works. If you have no experience maybe add some time to learning the basics to the schedule. The numerical accuracy of the vector library needs to be evaluated, I suspect that this might be the biggest roadblock in adding support by default. The performance of the library over different value ranges also needs to investigated. What kind of hardware do you have at your disposal? SIMD vectorization performance is very hardware dependent, you probably want at least a intel sandy bridge or AMD bulldozer type cpu to get the most out of the library, those CPUs have the newish AVX SIMD instructions. While I think your schedule is already packed, another point you could add if you have extra time is extending the existing SSE vectorized code in numpy to AVX if the vector library does not provide an equivalent (e.g. probably the boolean stuff) The runtime feature detection vector libraries provide can be very useful for this. Regards, Julian Taylor From projetmbc at gmail.com Mon Mar 17 18:33:28 2014 From: projetmbc at gmail.com (Christophe Bal) Date: Mon, 17 Mar 2014 23:33:28 +0100 Subject: [Numpy-discussion] [help needed] associativity and precedence of '@' In-Reply-To: References: Message-ID: I think that weak-left is a little strange, just think a little of the operators used by mathematicians that always follow a hierarchy. A parser is mostly done using grammars : see http://docs.python.org/3.1/reference/grammar.html. Defining *-product to have stronger priority than the @-product, and this last having stronger priority than +, will make the changes in the grammar easier. I'm now convinced of the usefulness of @ and @@ too but I also think that you must think of other uses than only for numpy. In other words, numpy is a the good argument for this new operators, but this can also open new perspectives for other uses. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ndarray at mac.com Mon Mar 17 19:00:32 2014 From: ndarray at mac.com (Alexander Belopolsky) Date: Mon, 17 Mar 2014 19:00:32 -0400 Subject: [Numpy-discussion] [help needed] associativity and precedence of '@' In-Reply-To: References: Message-ID: On Mon, Mar 17, 2014 at 6:33 PM, Christophe Bal wrote: > > Defining *-product to have stronger priority than the @-product, and this > last having stronger priority than +, will make the changes in the grammar > easier. > The easiest is to give @ the same precedence as *. This will only require changing term: factor (('*'|'/'|'%'|'//') factor)* to term: factor (('*'|'/'|'%'|'//'|'@') factor)* Anything else will require an extra rule, but in any case implementation is trivial. I don't think we need to worry about implementation details at this point. -------------- next part -------------- An HTML attachment was scrubbed... URL: From mrbago at gmail.com Mon Mar 17 19:16:23 2014 From: mrbago at gmail.com (Bago) Date: Mon, 17 Mar 2014 16:16:23 -0700 Subject: [Numpy-discussion] [help needed] associativity and precedence of '@' In-Reply-To: References: Message-ID: > > > I'm now convinced of the usefulness of @ and @@ too but I also think that > you must think of other uses than only for numpy. In other words, numpy is > a the good argument for this new operators, but this can also open new > perspectives for other uses. > > Speaking of `@@`, would the relative precedence of @ vs * be the same as @@ vs **? -------------- next part -------------- An HTML attachment was scrubbed... URL: From projetmbc at gmail.com Mon Mar 17 19:21:27 2014 From: projetmbc at gmail.com (Christophe Bal) Date: Tue, 18 Mar 2014 00:21:27 +0100 Subject: [Numpy-discussion] [help needed] associativity and precedence of '@' In-Reply-To: References: Message-ID: First of all I'm must be very tired because I've written *"I think that weak-left is a little strange..."* instead of *"I think that same-left is a little strange..."*. It is the night in french... ;-) So I'm definitely for the weak-left ! Here is my answer to Alexander Belopolsky. You are right from a grammar point of view but for a human this looks too weird because * and @ are of different kinds contrary to * and / for example. -------------- next part -------------- An HTML attachment was scrubbed... URL: From projetmbc at gmail.com Mon Mar 17 19:23:15 2014 From: projetmbc at gmail.com (Christophe Bal) Date: Tue, 18 Mar 2014 00:23:15 +0100 Subject: [Numpy-discussion] [help needed] associativity and precedence of '@' In-Reply-To: References: Message-ID: If you see the operators as following a hierarchy, the answer is simply yes. 2014-03-18 0:16 GMT+01:00 Bago : > >> I'm now convinced of the usefulness of @ and @@ too but I also think that >> you must think of other uses than only for numpy. In other words, numpy is >> a the good argument for this new operators, but this can also open new >> perspectives for other uses. >> >> > Speaking of `@@`, would the relative precedence of @ vs * be the same as > @@ vs **? > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Mon Mar 17 19:25:15 2014 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 17 Mar 2014 23:25:15 +0000 Subject: [Numpy-discussion] [help needed] associativity and precedence of '@' In-Reply-To: References: Message-ID: On Mon, Mar 17, 2014 at 11:16 PM, Bago wrote: > Speaking of `@@`, would the relative precedence of @ vs * be the same as @@ > vs **? This is one of the concerns that made Guido leery of @@ (but only one of them). Since we seem to be dropping @@: http://mail.scipy.org/pipermail/numpy-discussion/2014-March/069502.html we don't have to come up with an answer :-). -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org From josef.pktd at gmail.com Mon Mar 17 19:30:44 2014 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 17 Mar 2014 19:30:44 -0400 Subject: [Numpy-discussion] [help needed] associativity and precedence of '@' In-Reply-To: References: Message-ID: On Mon, Mar 17, 2014 at 6:33 PM, Christophe Bal wrote: > I think that weak-left is a little strange, just think a little of the > operators used by mathematicians that always follow a hierarchy. > > A parser is mostly done using grammars : see > http://docs.python.org/3.1/reference/grammar.html. > > Defining *-product to have stronger priority than the @-product, and this > last having stronger priority than +, will make the changes in the grammar > easier. > > I'm now convinced of the usefulness of @ and @@ too but I also think that > you must think of other uses than only for numpy. In other words, numpy is > a the good argument for this new operators, but this can also open new > perspectives for other uses. > My main problem with weak-left (* higher) and tight-left (@ higher) compared to same-left is that I don't see any obvious choice between the weak and tight. I don't think I would have problems with readability. Wikipedia doesn't say anything about precedence of Hadamard versus matrix product. matlab, IDL and Gauss (I checked the manual) all use same-left, as Nathaniel pointed out. For scalar * together with dot product which is more common in formulas, we would just read it sequentially, i.e. same-left. I don't remember when I have seen dot-in-a-circle in a paper, but I don't think there was any precedence either. --- I guess the same applies for other (mis)uses of @ from math import sqrt class MyOp(object): def __init__(self, func): self.func = func def __at__(self, x): return [self.func(xi) for xi in x] myop = MyOp(lambda x: sqrt(x)) print myop.__at__(range(3)) # myop @ range(5) print myop.__at__(range(3) * 2) # myop @ (range(5) * 2) print myop.__at__(range(3)) * 3 # myop @ range(5) * 3 ''' [0.0, 1.0, 1.4142135623730951] [0.0, 1.0, 1.4142135623730951, 0.0, 1.0, 1.4142135623730951] [0.0, 1.0, 1.4142135623730951, 0.0, 1.0, 1.4142135623730951, 0.0, 1.0, 1.4142135623730951] ''' ------------- Josef > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Mon Mar 17 19:30:57 2014 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 17 Mar 2014 23:30:57 +0000 Subject: [Numpy-discussion] [help needed] associativity and precedence of '@' In-Reply-To: References: Message-ID: On Mon, Mar 17, 2014 at 10:33 PM, Christophe Bal wrote: > I think that weak-left is a little strange, just think a little of the > operators used by mathematicians that always follow a hierarchy. Not sure what you mean -- I don't think most mathematicians think that scalar and matrix multiplication are above or below each other in precedence, for example. (Well, it's a strange question because scalar multiplication commutes, but even so, people often forget that these are even different operations.) > A parser is mostly done using grammars : see > http://docs.python.org/3.1/reference/grammar.html. > > Defining *-product to have stronger priority than the @-product, and this > last having stronger priority than +, will make the changes in the grammar > easier. > > I'm now convinced of the usefulness of @ and @@ too but I also think that > you must think of other uses than only for numpy. In other words, numpy is a > the good argument for this new operators, but this can also open new > perspectives for other uses. No, that's not how this game is played :-). The way it works is, we figure out the best possible way to handle the use case that we've demonstrated a need for (matrix multiplication), and then once we've done that someone might or might not find some other uses too. If they do then cool, if not then too bad. This follows the principle that it's better to be great at some things than to be mediocre at everything. -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org From projetmbc at gmail.com Mon Mar 17 20:16:59 2014 From: projetmbc at gmail.com (Christophe Bal) Date: Tue, 18 Mar 2014 01:16:59 +0100 Subject: [Numpy-discussion] [help needed] associativity and precedence of '@' In-Reply-To: References: Message-ID: >>> This follows the principle that it's better to be great >>> at some things than to be mediocre at everything. You're right. >>> >>> I think that weak-left is a little strange, just think >>> >>> a little of the operators used by mathematicians that >>> >>> always follow a hierarchy. >>> Not sure what you mean -- I don't think most mathematicians >>> think that scalar and matrix multiplication are above or below >>> each other in precedence, for example. You're right but on the other hand, I've never seen mixed use of matrix and scalar products without parenthesis... Indeed in math, we can use < Au , Bv > for the scalar product of two matrix-vector products. But here, I think that the situation is different because we are talking about operators from arrays to array : mainly @ , * and + (elementwise for the two last). Whereas in the preceding example, the scalar product is from arrays to scalar. As a math user, I think at this point that the arrays-to-array operators must follows a hierarchy. Who is the guy who have asked such a complicated question about precedence ? :-) 2014-03-18 0:30 GMT+01:00 Nathaniel Smith : > On Mon, Mar 17, 2014 at 10:33 PM, Christophe Bal > wrote: > > I think that weak-left is a little strange, just think a little of the > > operators used by mathematicians that always follow a hierarchy. > > Not sure what you mean -- I don't think most mathematicians think that > scalar and matrix multiplication are above or below each other in > precedence, for example. (Well, it's a strange question because scalar > multiplication commutes, but even so, people often forget that these > are even different operations.) > > > A parser is mostly done using grammars : see > > http://docs.python.org/3.1/reference/grammar.html. > > > > Defining *-product to have stronger priority than the @-product, and this > > last having stronger priority than +, will make the changes in the > grammar > > easier. > > > > I'm now convinced of the usefulness of @ and @@ too but I also think that > > you must think of other uses than only for numpy. In other words, numpy > is a > > the good argument for this new operators, but this can also open new > > perspectives for other uses. > > No, that's not how this game is played :-). The way it works is, we > figure out the best possible way to handle the use case that we've > demonstrated a need for (matrix multiplication), and then once we've > done that someone might or might not find some other uses too. If they > do then cool, if not then too bad. This follows the principle that > it's better to be great at some things than to be mediocre at > everything. > > -- > Nathaniel J. Smith > Postdoctoral researcher - Informatics - University of Edinburgh > http://vorpus.org > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Mon Mar 17 20:29:39 2014 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 18 Mar 2014 00:29:39 +0000 Subject: [Numpy-discussion] [help needed] associativity and precedence of '@' In-Reply-To: References: Message-ID: On Tue, Mar 18, 2014 at 12:16 AM, Christophe Bal wrote: >>>> >>> I think that weak-left is a little strange, just think >>>> >>> a little of the operators used by mathematicians that >>>> >>> always follow a hierarchy. > >>>> Not sure what you mean -- I don't think most mathematicians >>>> think that scalar and matrix multiplication are above or below >>>> each other in precedence, for example. > > You're right but on the other hand, I've never seen mixed use of matrix and > scalar products without parenthesis... Indeed in math, we can use < Au , Bv >> for the scalar product of two matrix-vector products. Not scalar product, scalar multiplication -- you're saying (I think) that 3 * Matrix1 * Matrix2 is just like 3 * Matrix1 + Matrix2 in the sense that mathematicians think of the 3 * Matrix1 part is very different from, and higher precedence than, the Matrix1 + Matrix2 part. And similarly that Matrix1 * Matrix2 * 3 is just like Matrix1 + Matrix2 * 3 But in fact I think if you asked most mathematicians which of the "*"'s in Matrix1 * Matrix2 * 3 is higher precedence, they would think this question very odd! -n -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org From njs at pobox.com Mon Mar 17 20:54:13 2014 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 18 Mar 2014 00:54:13 +0000 Subject: [Numpy-discussion] [help needed] associativity and precedence of '@' In-Reply-To: References: Message-ID: On Sat, Mar 15, 2014 at 6:28 PM, Nathaniel Smith wrote: > Mathematica: instead of having an associativity, a @ b @ c gets > converted into mdot([a, b, c]) So, I've been thinking about this (thanks to @rfateman for pointing it out), and wondering if Mathematica's approach is worth following up more. (It would need to make it past python-dev, of course, but worst case is just that they say no and we're back where we are now, so we might as well think it through.) Here's how it would work: Currently Python has 3 different kinds of ops: left-associative (most of them), right-associative (**), and "chaining". Chaining is used for comparison ops. Example: a < b < c gets parsed to something like do_comparison(args=[a, b, c], ops=[lt, lt]) Notice this is very different from either of (a < b) < c a < (b < c) Which means that comparisons aren't left- OR right-associative, they're this other thing, "chaining". So we could propose adding a 4th kind of op, calling "grouping", which would be only @. And the idea is that a @ b @ c would be equivalent to operator.matmul((a, b, c)) which eventually (see below) becomes a call to a.__matmul__((a, b, c)) We'd use exactly the same parsing rules as the chaining ops, so you can still control evaluation order with parentheses if you want: a @ (b @ c) -> matmul((a, matmul((b, c)))) (a @ b) @ c -> matmul((matmul((a, c)), c)) ...but if you don't specify, then each contiguous group of @ operators gets collected up and handed to __matmul__ together, and the __matmul__ implementation gets to decide which evaluation strategy to use. It's trivially fast for the computer to figure out the best evaluation order for matrix multiplication, so in practice I think this would mean that you could just stop worrying about parentheses for multiple contiguous calls to matmul. Fancier versions of __matmul__ defined on more specialized non-ndarray classes might even take into account their specialized knowledge of how expensive different evaluation orders are for their specific type -- I'm not sure if this actually happens in practice, but it might. (Like maybe the best way to evaluate a @ b @ c depends on the sparsity pattern in the various matrices, or maybe it depends on which matrices are currently on the GPU and which are in memory? Anyone have any real examples of this?) (Of course, this same evaluation-order problem arises for *all* expressions using numpy; being able to optimize whole expressions at a time is exactly what numexpr/numba/theano/etc. are useful for. So one could argue that "baking it in" to @ is pointless, if anyone gets tired of writing parentheses they should just use one of these libraries. Or maybe evaluation order problems arise so rarely for @ that no-one cares. But OTOH it would be so nice for once to just have a single best solution -- "always use @ and be happy, it just works" -- instead of all the caveats we normally do -- "@ is good in some cases, but in other cases mdot is better, or if you know you can just use @ with the right parentheses...".) Of course, we still have to say something about what value a @ b @ c actually computes. In the PEP semantics, it isn't always associative -- specifically not if we do Mat @ vec @ Mat. So in this approach, we still need to decide what matmul((Mat, vec, Mat)) should return. But, this is actually a feature! Because obviously what *should* be returned in this case is *not* (Mat @ vec) @ Mat, *or* Mat @ (vec @ Mat). Both of those answers are terrible; it's just, if you have an ordinary left-/right-associative operator, those are your only options. What *should* be returned is an error. And in this scheme we get to see the whole @ expression at once, so we actually can raise an error for such things. So, this possibly has nicer performance characteristics, and is also possibly has nicer semantics. Now, how would this look in terms of the language definition? As far as the parser and AST go, this would use exactly the same rules as the chaining ops, so that's easy. Having parsed, we must evaluate. Likely the most contentious part of this approach is that we now have an n-arg operator, so the standard __X__/__rX__ dichotomy won't work, we need to do something like multiple dispatch. I haven't followed the debate on this issue in detail, but what I'd propose for this limited context is not to do anything like "real" multiple dispatch, but just directly generalize the familiar __rX__ rule to n arguments. The __rX__ rule is how Python's existing binary operators work: usually to evaluate a # b, you try a.__foo__, and then b.__foo__ EXCEPT if b is a proper subclass of a, you try b first. Generalized to >2 arguments, this looks like: def operator.matmul(args): candidates = list(args) while candidates: candidate = pop_next_candidate(candidates) if hasattr(candidate, "__matmul__"): result = candidate.__matmul__(args) if result is not NotImplemented: return result raise TypeError def pop_next_candidate(candidates): classes = [c.__class__ for c in candidates] # We'll try the left-most remaining candidate... for i in range(len(candidates)): # ...unless there's a later, untried candidate that's a proper subclass. if not has_proper_subclass(classes[i], classes): return candidates.pop(i) assert False def has_proper_subclass(class_, other_classes): for other_class in other_classes: if (issubclass(other_class, class_) and not issubclass(class_, other_class)): return True return False ...which, it turns out, is exactly the lookup rule that __numpy_ufunc__ will use, so at least it isn't too weird from our point of view: http://docs.scipy.org/doc/numpy-dev/reference/arrays.classes.html#numpy.class.__numpy_ufunc__ There are still plenty of details to think about, e.g.: What does the in-place operator do? I think a @= b @ c would have to be the same as a = a @ (b @ c) and NOT a = a @ b @ c because otherwise what do you do with a @= b @ c + d. I'm not sure how this would interact with implementing np.dot (which we'd still need for its out= argument, and will I guess do dispatch through the __numpy_ufunc__ mechanism?). We should probably work through this in detail. We'd still need to pick a precedence for @: grouping is different than left-associativity, so we can't do same-group, we'd have to pick either tight-group or weak-group. My gut feeling is that tight-group makes more sense that weak-group, because if @ is a magical thing that collects up a group of items, then it is helpful if there's a simple visual mapping where the group starts just to the left of the first @ and extends just to the right of the last @. I'm still not at all sure this rigmorale is worth it -- I still think we need some data on how often people chain together multiple @ or np.dot calls. Still, I thought I'd throw this out here and see what people think of it. -n -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org From njs at pobox.com Mon Mar 17 20:56:41 2014 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 18 Mar 2014 00:56:41 +0000 Subject: [Numpy-discussion] [help needed] associativity and precedence of '@' In-Reply-To: References: Message-ID: On Mon, Mar 17, 2014 at 8:37 PM, Russell E. Owen wrote: > After seeing all the traffic on this thread, I am in favor of > "same-left" because it is easiest to remember: > - It introduces no new rules. > - It is unambiguous. If we pick option 2 or 3 we have no strong reason > to favor one over the other, leaving users to guess. > > To my mind, being able to easily reason about code you are reading is > more important that hoping to increase efficiency for one common case > when not using parenthesis. Personally I'm leaning in a similar direction (at least as far as left- versus right-associativity goes; I'm not sure yet what I think about the magic "grouping" thing I just posted :-)). The more I think about it, the weaker I find the avoiding-parentheses argument. If you're going to take the trouble to think about which ordering is best, you should write that down with parentheses no matter what the associativity is, so that when I have to read your code I'll see the parentheses and know that you thought about it! And certainly the slow part of this is not typing the parentheses, it's figuring out what order is best. (The potential advantage of "grouping" isn't that you don't have to write as many parentheses, it's that you don't have to *think* about parentheses.) The fact that Matlab et al get along fine with same-left also strikes me as strong evidence that right-associativity's benefits are at least not overwhelmingly compelling... -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org From ndarray at mac.com Mon Mar 17 21:25:30 2014 From: ndarray at mac.com (Alexander Belopolsky) Date: Mon, 17 Mar 2014 21:25:30 -0400 Subject: [Numpy-discussion] [help needed] associativity and precedence of '@' In-Reply-To: References: Message-ID: On Mon, Mar 17, 2014 at 8:54 PM, Nathaniel Smith wrote: > > Currently Python has 3 different kinds of ops: left-associative (most > of them), right-associative (**), and "chaining". Chaining is used for > comparison ops. Example: > > a < b < c > > gets parsed to something like > > do_comparison(args=[a, b, c], ops=[lt, lt]) The actual parse tree is more like Compare(a, [lt, lt], [b, c]) with the first aruments playing a distinct role: >>> ast.dump(ast.parse('a From jaime.frio at gmail.com Mon Mar 17 21:29:34 2014 From: jaime.frio at gmail.com (=?ISO-8859-1?Q?Jaime_Fern=E1ndez_del_R=EDo?=) Date: Mon, 17 Mar 2014 18:29:34 -0700 Subject: [Numpy-discussion] [help needed] associativity and precedence of '@' In-Reply-To: References: Message-ID: On Mar 17, 2014 5:54 PM, "Nathaniel Smith" wrote: > > On Sat, Mar 15, 2014 at 6:28 PM, Nathaniel Smith wrote: > > Mathematica: instead of having an associativity, a @ b @ c gets > > converted into mdot([a, b, c]) > > So, I've been thinking about this (thanks to @rfateman for pointing it > out), and wondering if Mathematica's approach is worth following up > more. (It would need to make it past python-dev, of course, but worst > case is just that they say no and we're back where we are now, so we > might as well think it through.) > > Here's how it would work: > > Currently Python has 3 different kinds of ops: left-associative (most > of them), right-associative (**), and "chaining". Chaining is used for > comparison ops. Example: > > a < b < c > > gets parsed to something like > > do_comparison(args=[a, b, c], ops=[lt, lt]) > > Notice this is very different from either of > > (a < b) < c > a < (b < c) > > Which means that comparisons aren't left- OR right-associative, > they're this other thing, "chaining". > > So we could propose adding a 4th kind of op, calling "grouping", which > would be only @. And the idea is that > > a @ b @ c > > would be equivalent to > > operator.matmul((a, b, c)) > > which eventually (see below) becomes a call to > > a.__matmul__((a, b, c)) > > We'd use exactly the same parsing rules as the chaining ops, so you > can still control evaluation order with parentheses if you want: > > a @ (b @ c) -> matmul((a, matmul((b, c)))) > (a @ b) @ c -> matmul((matmul((a, c)), c)) > > ...but if you don't specify, then each contiguous group of @ operators > gets collected up and handed to __matmul__ together, and the > __matmul__ implementation gets to decide which evaluation strategy to > use. > > It's trivially fast for the computer to figure out the best evaluation > order for matrix multiplication, so in practice I think this would > mean that you could just stop worrying about parentheses for multiple > contiguous calls to matmul. Fancier versions of __matmul__ defined on > more specialized non-ndarray classes might even take into account > their specialized knowledge of how expensive different evaluation > orders are for their specific type -- I'm not sure if this actually > happens in practice, but it might. (Like maybe the best way to > evaluate a @ b @ c depends on the sparsity pattern in the various > matrices, or maybe it depends on which matrices are currently on the > GPU and which are in memory? Anyone have any real examples of this?) > > (Of course, this same evaluation-order problem arises for *all* > expressions using numpy; being able to optimize whole expressions at a > time is exactly what numexpr/numba/theano/etc. are useful for. So one > could argue that "baking it in" to @ is pointless, if anyone gets > tired of writing parentheses they should just use one of these > libraries. Or maybe evaluation order problems arise so rarely for @ > that no-one cares. But OTOH it would be so nice for once to just have > a single best solution -- "always use @ and be happy, it just works" > -- instead of all the caveats we normally do -- "@ is good in some > cases, but in other cases mdot is better, or if you know you can just > use @ with the right parentheses...".) > > Of course, we still have to say something about what value a @ b @ c > actually computes. In the PEP semantics, it isn't always associative > -- specifically not if we do Mat @ vec @ Mat. So in this approach, we > still need to decide what > matmul((Mat, vec, Mat)) > should return. > > But, this is actually a feature! Because obviously what *should* be > returned in this case is *not* (Mat @ vec) @ Mat, *or* Mat @ (vec @ > Mat). Both of those answers are terrible; it's just, if you have an > ordinary left-/right-associative operator, those are your only > options. What *should* be returned is an error. And in this scheme we > get to see the whole @ expression at once, so we actually can raise an > error for such things. > > So, this possibly has nicer performance characteristics, and is also > possibly has nicer semantics. > > Now, how would this look in terms of the language definition? > > As far as the parser and AST go, this would use exactly the same rules > as the chaining ops, so that's easy. > > Having parsed, we must evaluate. Likely the most contentious part of > this approach is that we now have an n-arg operator, so the standard > __X__/__rX__ dichotomy won't work, we need to do something like > multiple dispatch. I haven't followed the debate on this issue in > detail, but what I'd propose for this limited context is not to do > anything like "real" multiple dispatch, but just directly generalize > the familiar __rX__ rule to n arguments. The __rX__ rule is how > Python's existing binary operators work: usually to evaluate a # b, > you try a.__foo__, and then b.__foo__ EXCEPT if b is a proper subclass > of a, you try b first. Generalized to >2 arguments, this looks like: > > def operator.matmul(args): > candidates = list(args) > while candidates: > candidate = pop_next_candidate(candidates) > if hasattr(candidate, "__matmul__"): > result = candidate.__matmul__(args) > if result is not NotImplemented: > return result > raise TypeError > > def pop_next_candidate(candidates): > classes = [c.__class__ for c in candidates] > # We'll try the left-most remaining candidate... > for i in range(len(candidates)): > # ...unless there's a later, untried candidate that's a proper subclass. > if not has_proper_subclass(classes[i], classes): > return candidates.pop(i) > assert False > > def has_proper_subclass(class_, other_classes): > for other_class in other_classes: > if (issubclass(other_class, class_) > and not issubclass(class_, other_class)): > return True > return False > > ...which, it turns out, is exactly the lookup rule that > __numpy_ufunc__ will use, so at least it isn't too weird from our > point of view: > http://docs.scipy.org/doc/numpy-dev/reference/arrays.classes.html#numpy.class.__numpy_ufunc__ > > There are still plenty of details to think about, e.g.: > > What does the in-place operator do? I think > a @= b @ c > would have to be the same as > a = a @ (b @ c) > and NOT > a = a @ b @ c > because otherwise what do you do with > a @= b @ c + d. You cannot do inplace matrix multiplication without an intermediate copy of the first array, or at least of each row of the first array in turns. I don't like the idea of providing syntax that looks faster but isn't, I'd rather see @= return an error in the context of matrix multiplication. > I'm not sure how this would interact with implementing np.dot (which > we'd still need for its out= argument, and will I guess do dispatch > through the __numpy_ufunc__ mechanism?). We should probably work > through this in detail. > > We'd still need to pick a precedence for @: grouping is different than > left-associativity, so we can't do same-group, we'd have to pick > either tight-group or weak-group. My gut feeling is that tight-group > makes more sense that weak-group, because if @ is a magical thing that > collects up a group of items, then it is helpful if there's a simple > visual mapping where the group starts just to the left of the first @ > and extends just to the right of the last @. > > I'm still not at all sure this rigmorale is worth it -- I still think > we need some data on how often people chain together multiple @ or > np.dot calls. > > Still, I thought I'd throw this out here and see what people think of it. > > -n > > -- > Nathaniel J. Smith > Postdoctoral researcher - Informatics - University of Edinburgh > http://vorpus.org > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From cjwilliams43 at gmail.com Mon Mar 17 23:01:41 2014 From: cjwilliams43 at gmail.com (Colin J. Williams) Date: Mon, 17 Mar 2014 23:01:41 -0400 Subject: [Numpy-discussion] NumPy-Discussion Digest, Vol 90, Issue 56 In-Reply-To: References: Message-ID: <5327B715.4040407@gmail.com> Julian, I can see the need to recognize both column and row vectors, but why not with np.matrix? I can see no need for a new operator and hope to be able to comment more fully on PEP 465 in a few days. Colin W. On 17-Mar-2014 7:19 PM, numpy-discussion-request at scipy.org wrote: > Send NumPy-Discussion mailing list submissions to > numpy-discussion at scipy.org > > To subscribe or unsubscribe via the World Wide Web, visit > http://mail.scipy.org/mailman/listinfo/numpy-discussion > or, via email, send a message with subject or body 'help' to > numpy-discussion-request at scipy.org > > You can reach the person managing the list at > numpy-discussion-owner at scipy.org > > When replying, please edit your Subject line so it is more specific > than "Re: Contents of NumPy-Discussion digest..." > > > Today's Topics: > > 1. Re: [help needed] associativity and precedence of '@' > (Nathaniel Smith) > 2. Re: GSoC project: draft of proposal (Julian Taylor) > 3. Re: [help needed] associativity and precedence of '@' > (Christophe Bal) > 4. Re: [help needed] associativity and precedence of '@' > (Alexander Belopolsky) > 5. Re: [help needed] associativity and precedence of '@' (Bago) > 6. Re: [help needed] associativity and precedence of '@' > (Christophe Bal) > 7. Re: [help needed] associativity and precedence of '@' > (Christophe Bal) > 8. Re: [help needed] associativity and precedence of '@' > (Nathaniel Smith) > > > ---------------------------------------------------------------------- > > Message: 1 > Date: Mon, 17 Mar 2014 22:02:33 +0000 > From: Nathaniel Smith > Subject: Re: [Numpy-discussion] [help needed] associativity and > precedence of '@' > To: Discussion of Numerical Python > Message-ID: > > Content-Type: text/plain; charset=UTF-8 > > On Mon, Mar 17, 2014 at 9:38 PM, Christophe Bal wrote: >> Here is the translation. ;-) >> >> Hello, >> and what about something like that ? >> >> a @ b @ c -> (a @ b) @ c >> a * b @ c -> (a * b) @ c >> a @ b * c -> a @ (b * c) >> >> Easy to remember: the *-product has priority regarding to the @-product, and >> we just do @-product from left to right. > In the terminology we've been using in this thread, this is "weak-left". > >> An advantage of this is that most parsers do analyze from left to right. >> >> So I really think that it is a better choice than the weak-right one. > We've mostly ignored this option because of assuming that if we want > left-associativity, we should go with "same-left" instead of > "weak-left". Same-left is: > > a @ b @ c -> (a @ b) @ c > a * b @ c -> (a * b) @ c > a @ b * c -> (a @ b) * c > > i.e., even more left-to-right than weak-left :-) > > Do you think weak-left is better than same-left? > -------------- next part -------------- An HTML attachment was scrubbed... URL: From daoust.mj at gmail.com Mon Mar 17 23:28:21 2014 From: daoust.mj at gmail.com (Mark Daoust) Date: Mon, 17 Mar 2014 23:28:21 -0400 Subject: [Numpy-discussion] [help needed] associativity and precedence of '@' In-Reply-To: References: Message-ID: On Mon, Mar 17, 2014 at 8:54 PM, Nathaniel Smith wrote: > > But, this is actually a feature! Because obviously what *should* be > returned in this case is *not* (Mat @ vec) @ Mat, *or* Mat @ (vec @ > Mat). Both of those answers are terrible; it's just, if you have an > ordinary left-/right-associative operator, those are your only > options. What *should* be returned is an error. And in this scheme we > get to see the whole @ expression at once, so we actually can raise an > error for such things. > Sorry if this is a little off topic. But there's still something about the "vector" examples that bugs me, "matrix at vector" and "vector@@2", keep popping up (this also applies to the matrix at matrix examples to a lesser extent). I'm a little unconformable looking at the shape to to decide what's a matrix and what's a vector. (Matlab has some problems like this) If it only has one or two dimensions it's easy, but I always find that if I've written code that works for 1 matrix or vector, 5 minutes later I want it to work for fields of matrices or vectors. If we're just going by shape there's no way to distinguish between a 2d field of matrices and a 3d field of vectors. I guess this is a repeat of part of what Eelco Hoogendoorn saying a few posts back I was just wondering if anyone sees a place, to get @ a little closer to Einsum, for some sort of array class that understands the difference between a 4D array of scalars, a 3D array of vectors, and a 2D array of matrices... The difference between the axes that broad-cast and the axes that can sum when you hit them with an @ ... or something like that. Just a thought. Einsum is fantastic by the way, totally worth learning and using. Mark Daoust -------------- next part -------------- An HTML attachment was scrubbed... URL: From hoogendoorn.eelco at gmail.com Tue Mar 18 02:13:49 2014 From: hoogendoorn.eelco at gmail.com (Eelco Hoogendoorn) Date: Tue, 18 Mar 2014 07:13:49 +0100 Subject: [Numpy-discussion] [help needed] associativity and precedence of '@' In-Reply-To: References: Message-ID: Perhaps this a bit of a thread hyjack; but this discussion got me thinking about how to arrive at a more vectorized/tensorified way of specifying linear algebra operations, in an elegant manner. I probably got a little carried away, but what about this syntax? - indexing/calling an ndarray with a string returns a TensorExpression object - these TensorExpression objects can be combined into a graph using operator overloads - and these graphs are translated to calls to BLAS or einsum, as is appropriate #declare some symbols i,j,ij,k = 'i','j','ij','k' #we may force evaluation of a (sub) TensorExpression by calling it #this is trivial to translate to call to einsum #but such special cases could be dispatched to BLAS as well b = (A(ij) * x(j)) (i) #alternatively, we can predeclare a LHS which is automatically sized later #note that this translates into the same call as the above; just some syntactic sugar b = np.empty(()) b[i] = A(ij) * x(j) #more complex TensorExpression graphs of this form are trivial to translate to a call to einsum as well a(i)*b(j)*c(k) #conceptually, there is no need to limit this scheme to multiplications only! #although such generalizations would require a more complex execution engine #however, the revamped nditer should make this quite managable to implement a(i)*b(j) + c(k) #if axes strings are omitted, standard numpy broadcasting rules are applied to the expressiongraph created #this is identical to a*b+c; except that we have the opportunity to eliminate temporaries a()*b()+c() Note that such an approach kills quite some birds with one stone it allows for the elimination of temporaries along the lines of numexpr But if i could write: b[i] = A[ij] * x[j] I would much prefer that over b = A @ x even though the latter is shorter Now if i had n input and output vectors, it would be easy what to do with them: b[ni] = A[ij] * x[nj] As i argued earlier, I much prefer this form of explicitness over conventions about what constitutes a row or column vector. And vectorization of linear algebra is a trivial extension in this manner, which in itself is just a subset of even more general multilinear products, which themselves are a subset of more general expression involving things other than products Its a somewhat ambitious idea, and there are probably reasons why it isnt a good idea as well, but it does not require python language modifications, and it does not clash with any other functionality or syntax of numpy, as far as i can tell. Calling of arrays is not yet defined, and alternatively array indexing could be overloaded on string type. Either way, something to chew on when deciding on the best way to go forward. On Tue, Mar 18, 2014 at 4:28 AM, Mark Daoust wrote: > On Mon, Mar 17, 2014 at 8:54 PM, Nathaniel Smith wrote: > >> >> But, this is actually a feature! Because obviously what *should* be >> returned in this case is *not* (Mat @ vec) @ Mat, *or* Mat @ (vec @ >> Mat). Both of those answers are terrible; it's just, if you have an >> ordinary left-/right-associative operator, those are your only >> options. What *should* be returned is an error. And in this scheme we >> get to see the whole @ expression at once, so we actually can raise an >> error for such things. >> > > > Sorry if this is a little off topic. > > But there's still something about the "vector" examples that bugs me, > "matrix at vector" and "vector@@2", keep popping up (this also applies to > the matrix at matrix examples to a lesser extent). > > I'm a little unconformable looking at the shape to to decide what's a > matrix and what's a vector. (Matlab has some problems like this) > > If it only has one or two dimensions it's easy, but I always find that if > I've written code that works for 1 matrix or vector, 5 minutes later I want > it to work for fields of matrices or vectors. If we're just going by shape > there's no way to distinguish between a 2d field of matrices and a 3d field > of vectors. > > I guess this is a repeat of part of what Eelco Hoogendoorn saying a few > posts back > > I was just wondering if anyone sees a place, to get @ a little closer to > Einsum, for some sort of array class that understands the difference > between a 4D array of scalars, a 3D array of vectors, and a 2D array of > matrices... The difference between the axes that broad-cast and the axes > that can sum when you hit them with an @ ... or something like that. > > Just a thought. > > Einsum is fantastic by the way, totally worth learning and using. > > > > > Mark Daoust > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From seb.haase at gmail.com Tue Mar 18 04:46:41 2014 From: seb.haase at gmail.com (Sebastian Haase) Date: Tue, 18 Mar 2014 09:46:41 +0100 Subject: [Numpy-discussion] [help needed] associativity and precedence of '@' In-Reply-To: References: Message-ID: Just add one vote: I am for * right association * because 1) I'm thinking of matrix multiplication more like operators, which I also learned to work from right to left and because 2) I would put a vector to the right, which would result in better performance. I don't have an opinion on tight/same/ or weak.... (maybe that means then 'same' because it's easier to remember !?) My two cents, Sebastian Haase On Tue, Mar 18, 2014 at 7:13 AM, Eelco Hoogendoorn wrote: > > > Perhaps this a bit of a thread hyjack; but this discussion got me thinking > about how to arrive > > at a more vectorized/tensorified way of specifying linear algebra > operations, in an elegant manner. > > I probably got a little carried away, but what about this syntax? > > indexing/calling an ndarray with a string returns a TensorExpression object > these TensorExpression objects can be combined into a graph using operator > overloads > and these graphs are translated to calls to BLAS or einsum, as is > appropriate > > > #declare some symbols > i,j,ij,k = 'i','j','ij','k' > #we may force evaluation of a (sub) TensorExpression by calling it > #this is trivial to translate to call to einsum > #but such special cases could be dispatched to BLAS as well > b = (A(ij) * x(j)) (i) > #alternatively, we can predeclare a LHS which is automatically sized later > #note that this translates into the same call as the above; just some > syntactic sugar > b = np.empty(()) > b[i] = A(ij) * x(j) > #more complex TensorExpression graphs of this form are trivial to translate > to a call to einsum as well > a(i)*b(j)*c(k) > #conceptually, there is no need to limit this scheme to multiplications > only! > #although such generalizations would require a more complex execution engine > #however, the revamped nditer should make this quite managable to implement > a(i)*b(j) + c(k) > #if axes strings are omitted, standard numpy broadcasting rules are applied > to the expressiongraph created > #this is identical to a*b+c; except that we have the opportunity to > eliminate temporaries > a()*b()+c() > > > Note that such an approach kills quite some birds with one stone > it allows for the elimination of temporaries along the lines of numexpr > > But if i could write: > > b[i] = A[ij] * x[j] > I would much prefer that over > b = A @ x > even though the latter is shorter > > Now if i had n input and output vectors, it would be easy what to do with > them: > > b[ni] = A[ij] * x[nj] > > As i argued earlier, I much prefer this form of explicitness over > conventions about what constitutes a row or column vector. And vectorization > of linear algebra is a trivial extension in this manner, which in itself is > just a subset of even more general multilinear products, which themselves > are a subset of more general expression involving things other than products > > Its a somewhat ambitious idea, and there are probably reasons why it isnt a > good idea as well, but it does not require python language modifications, > and it does not clash with any other functionality or syntax of numpy, as > far as i can tell. Calling of arrays is not yet defined, and alternatively > array indexing could be overloaded on string type. > > Either way, something to chew on when deciding on the best way to go > forward. > > > > > > On Tue, Mar 18, 2014 at 4:28 AM, Mark Daoust wrote: >> >> On Mon, Mar 17, 2014 at 8:54 PM, Nathaniel Smith wrote: >>> >>> >>> But, this is actually a feature! Because obviously what *should* be >>> returned in this case is *not* (Mat @ vec) @ Mat, *or* Mat @ (vec @ >>> Mat). Both of those answers are terrible; it's just, if you have an >>> ordinary left-/right-associative operator, those are your only >>> options. What *should* be returned is an error. And in this scheme we >>> get to see the whole @ expression at once, so we actually can raise an >>> error for such things. >> >> >> >> Sorry if this is a little off topic. >> >> But there's still something about the "vector" examples that bugs me, >> "matrix at vector" and "vector@@2", keep popping up (this also applies to the >> matrix at matrix examples to a lesser extent). >> >> I'm a little unconformable looking at the shape to to decide what's a >> matrix and what's a vector. (Matlab has some problems like this) >> >> If it only has one or two dimensions it's easy, but I always find that if >> I've written code that works for 1 matrix or vector, 5 minutes later I want >> it to work for fields of matrices or vectors. If we're just going by shape >> there's no way to distinguish between a 2d field of matrices and a 3d field >> of vectors. >> >> I guess this is a repeat of part of what Eelco Hoogendoorn saying a few >> posts back >> >> I was just wondering if anyone sees a place, to get @ a little closer to >> Einsum, for some sort of array class that understands the difference between >> a 4D array of scalars, a 3D array of vectors, and a 2D array of matrices... >> The difference between the axes that broad-cast and the axes that can sum >> when you hit them with an @ ... or something like that. >> >> Just a thought. >> >> Einsum is fantastic by the way, totally worth learning and using. >> >> >> >> >> Mark Daoust >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From robert.kern at gmail.com Tue Mar 18 05:14:18 2014 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 18 Mar 2014 09:14:18 +0000 Subject: [Numpy-discussion] [help needed] associativity and precedence of '@' In-Reply-To: References: Message-ID: On Tue, Mar 18, 2014 at 12:54 AM, Nathaniel Smith wrote: > On Sat, Mar 15, 2014 at 6:28 PM, Nathaniel Smith wrote: >> Mathematica: instead of having an associativity, a @ b @ c gets >> converted into mdot([a, b, c]) > > So, I've been thinking about this (thanks to @rfateman for pointing it > out), and wondering if Mathematica's approach is worth following up > more. (It would need to make it past python-dev, of course, but worst > case is just that they say no and we're back where we are now, so we > might as well think it through.) I predict with near-certainty that this will be rejected, but that doesn't prevent it from derailing the discussion. This proposal is unlike anything else in Python. Chained comparisons are *not* similar to this proposal. The chaining only happens at the syntax level, not the semantics. `a < b < c` gets compiled down to `a.__lt__(b) and b.__lt__(c)`, not `do_comparison([a, b, c], [lt, lt])`. We have approval for a binary @ operator. Take the win. -- Robert Kern From hoogendoorn.eelco at gmail.com Tue Mar 18 05:50:37 2014 From: hoogendoorn.eelco at gmail.com (Eelco Hoogendoorn) Date: Tue, 18 Mar 2014 10:50:37 +0100 Subject: [Numpy-discussion] [help needed] associativity and precedence of '@' In-Reply-To: References: Message-ID: To elaborate a little on such a more general and explicit method of specifying linear operations (perhaps 'expressions with named axes' is a good nomer to cover this topic). I think indexing rather than calling is preferable. I worried at first about the performance overhead of checking for strings at every indexing op, but get ndarray__getitem__ is already quite a complex beast anyway, and adding (yet another) type test on its args isn't a significant difference. For those who disagree; we could also approach strings with a 'forgiveness is better then permission' attitude. The general rules could be: if no string args, everything works as normal. In case of string args, we may think of the effect of __getitem__ as indexing with strings replaced by colons first, and then creating a NamedAxisIndexExpression (NAIE), associating the given string label with each corresponding axis. Thus, we can write things like A[0:3,'i'] As some additional rules; string arguments can be 'expanded', the string is split on commas if present, and otherwise split into characters, which are then the axis labels. In expressions, all non-labeled axes are treated in sequential order, similar to the ... construct, and have standard numpy broadcasting semantics. The only problem with [] notation is field name lookup; though I have always felt that tables with named columns should be an ndarray subtype, given their fundamentally different indexing semantics. Realizing the full potential of such an approach would be a complex undertaking, but to start with, a more elegant interface to np.einsum would be rather easy to implement. On Tue, Mar 18, 2014 at 9:46 AM, Sebastian Haase wrote: > Just add one vote: I am for > * right association * > because 1) I'm thinking of matrix multiplication more like operators, > which I also learned to work from right to left and because 2) I would > put a vector to the right, which would result in better performance. > > I don't have an opinion on tight/same/ or weak.... (maybe that means > then 'same' because it's easier to remember !?) > > My two cents, > Sebastian Haase > > > On Tue, Mar 18, 2014 at 7:13 AM, Eelco Hoogendoorn > wrote: > > > > > > Perhaps this a bit of a thread hyjack; but this discussion got me > thinking > > about how to arrive > > > > at a more vectorized/tensorified way of specifying linear algebra > > operations, in an elegant manner. > > > > I probably got a little carried away, but what about this syntax? > > > > indexing/calling an ndarray with a string returns a TensorExpression > object > > these TensorExpression objects can be combined into a graph using > operator > > overloads > > and these graphs are translated to calls to BLAS or einsum, as is > > appropriate > > > > > > #declare some symbols > > i,j,ij,k = 'i','j','ij','k' > > #we may force evaluation of a (sub) TensorExpression by calling it > > #this is trivial to translate to call to einsum > > #but such special cases could be dispatched to BLAS as well > > b = (A(ij) * x(j)) (i) > > #alternatively, we can predeclare a LHS which is automatically sized > later > > #note that this translates into the same call as the above; just some > > syntactic sugar > > b = np.empty(()) > > b[i] = A(ij) * x(j) > > #more complex TensorExpression graphs of this form are trivial to > translate > > to a call to einsum as well > > a(i)*b(j)*c(k) > > #conceptually, there is no need to limit this scheme to multiplications > > only! > > #although such generalizations would require a more complex execution > engine > > #however, the revamped nditer should make this quite managable to > implement > > a(i)*b(j) + c(k) > > #if axes strings are omitted, standard numpy broadcasting rules are > applied > > to the expressiongraph created > > #this is identical to a*b+c; except that we have the opportunity to > > eliminate temporaries > > a()*b()+c() > > > > > > Note that such an approach kills quite some birds with one stone > > it allows for the elimination of temporaries along the lines of numexpr > > > > But if i could write: > > > > b[i] = A[ij] * x[j] > > I would much prefer that over > > b = A @ x > > even though the latter is shorter > > > > Now if i had n input and output vectors, it would be easy what to do with > > them: > > > > b[ni] = A[ij] * x[nj] > > > > As i argued earlier, I much prefer this form of explicitness over > > conventions about what constitutes a row or column vector. And > vectorization > > of linear algebra is a trivial extension in this manner, which in itself > is > > just a subset of even more general multilinear products, which themselves > > are a subset of more general expression involving things other than > products > > > > Its a somewhat ambitious idea, and there are probably reasons why it > isnt a > > good idea as well, but it does not require python language modifications, > > and it does not clash with any other functionality or syntax of numpy, as > > far as i can tell. Calling of arrays is not yet defined, and > alternatively > > array indexing could be overloaded on string type. > > > > Either way, something to chew on when deciding on the best way to go > > forward. > > > > > > > > > > > > On Tue, Mar 18, 2014 at 4:28 AM, Mark Daoust > wrote: > >> > >> On Mon, Mar 17, 2014 at 8:54 PM, Nathaniel Smith wrote: > >>> > >>> > >>> But, this is actually a feature! Because obviously what *should* be > >>> returned in this case is *not* (Mat @ vec) @ Mat, *or* Mat @ (vec @ > >>> Mat). Both of those answers are terrible; it's just, if you have an > >>> ordinary left-/right-associative operator, those are your only > >>> options. What *should* be returned is an error. And in this scheme we > >>> get to see the whole @ expression at once, so we actually can raise an > >>> error for such things. > >> > >> > >> > >> Sorry if this is a little off topic. > >> > >> But there's still something about the "vector" examples that bugs me, > >> "matrix at vector" and "vector@@2", keep popping up (this also applies to > the > >> matrix at matrix examples to a lesser extent). > >> > >> I'm a little unconformable looking at the shape to to decide what's a > >> matrix and what's a vector. (Matlab has some problems like this) > >> > >> If it only has one or two dimensions it's easy, but I always find that > if > >> I've written code that works for 1 matrix or vector, 5 minutes later I > want > >> it to work for fields of matrices or vectors. If we're just going by > shape > >> there's no way to distinguish between a 2d field of matrices and a 3d > field > >> of vectors. > >> > >> I guess this is a repeat of part of what Eelco Hoogendoorn saying a few > >> posts back > >> > >> I was just wondering if anyone sees a place, to get @ a little closer to > >> Einsum, for some sort of array class that understands the difference > between > >> a 4D array of scalars, a 3D array of vectors, and a 2D array of > matrices... > >> The difference between the axes that broad-cast and the axes that can > sum > >> when you hit them with an @ ... or something like that. > >> > >> Just a thought. > >> > >> Einsum is fantastic by the way, totally worth learning and using. > >> > >> > >> > >> > >> Mark Daoust > >> > >> _______________________________________________ > >> NumPy-Discussion mailing list > >> NumPy-Discussion at scipy.org > >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > >> > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From projetmbc at gmail.com Tue Mar 18 11:22:59 2014 From: projetmbc at gmail.com (Christophe Bal) Date: Tue, 18 Mar 2014 16:22:59 +0100 Subject: [Numpy-discussion] [help needed] associativity and precedence of '@' In-Reply-To: References: Message-ID: *About weak-left.* You need to define a priority of @ the matrix product regarding to * the elementwise product because (A*B)@C <> A*(B at C) : see the example above. I say that also from a mathematical point of view. Using mathematical like notations, Matrix1 * Matrix2 * 3 can be written because (Matrix1 * Matrix2) * 3 = Matrix1 * (Matrix2 * 3). That's why I think that the weak-left is the better choice. *About group implementation.* I think the idea of calculating A at B@C as __atmul__([A,B,C]) is a very good idea because this allows efficient implementations. *---------------------* * [1 2]* *A = [3 4]* * [5 6]* *B = [7 8]* * [a d]* *C = [b c]* *---------------------* *(A*B)@C* *=* *[5 12] [a d]* *[21 32] @ [b c]* *=* *[5a+12b 5d+12c ]* *[21a+32b 21d+32c]* *---------------------* *A*(B at C)* *=* *[1 2] [5a+6b 5d+6c]* *[3 4] * [7a+8b 7d+8c]* *=* *[5a+6b 10d+12c]* *[21a+24b 28d+32c]* 2014-03-18 10:50 GMT+01:00 Eelco Hoogendoorn : > To elaborate a little on such a more general and explicit method of > specifying linear operations (perhaps 'expressions with named axes' is a > good nomer to cover this topic). > > I think indexing rather than calling is preferable. I worried at first > about the performance overhead of checking for strings at every indexing > op, but get ndarray__getitem__ is already quite a complex beast anyway, and > adding (yet another) type test on its args isn't a significant difference. > For those who disagree; we could also approach strings with a 'forgiveness > is better then permission' attitude. > > The general rules could be: if no string args, everything works as normal. > In case of string args, we may think of the effect of __getitem__ as > indexing with strings replaced by colons first, and then creating a > NamedAxisIndexExpression (NAIE), associating the given string label with > each corresponding axis. Thus, we can write things like A[0:3,'i'] > > As some additional rules; string arguments can be 'expanded', the string > is split on commas if present, and otherwise split into characters, which > are then the axis labels. > > In expressions, all non-labeled axes are treated in sequential order, > similar to the ... construct, and have standard numpy broadcasting > semantics. > > The only problem with [] notation is field name lookup; though I have > always felt that tables with named columns should be an ndarray subtype, > given their fundamentally different indexing semantics. > > Realizing the full potential of such an approach would be a complex > undertaking, but to start with, a more elegant interface to np.einsum would > be rather easy to implement. > > > On Tue, Mar 18, 2014 at 9:46 AM, Sebastian Haase wrote: > >> Just add one vote: I am for >> * right association * >> because 1) I'm thinking of matrix multiplication more like operators, >> which I also learned to work from right to left and because 2) I would >> put a vector to the right, which would result in better performance. >> >> I don't have an opinion on tight/same/ or weak.... (maybe that means >> then 'same' because it's easier to remember !?) >> >> My two cents, >> Sebastian Haase >> >> >> On Tue, Mar 18, 2014 at 7:13 AM, Eelco Hoogendoorn >> wrote: >> > >> > >> > Perhaps this a bit of a thread hyjack; but this discussion got me >> thinking >> > about how to arrive >> > >> > at a more vectorized/tensorified way of specifying linear algebra >> > operations, in an elegant manner. >> > >> > I probably got a little carried away, but what about this syntax? >> > >> > indexing/calling an ndarray with a string returns a TensorExpression >> object >> > these TensorExpression objects can be combined into a graph using >> operator >> > overloads >> > and these graphs are translated to calls to BLAS or einsum, as is >> > appropriate >> > >> > >> > #declare some symbols >> > i,j,ij,k = 'i','j','ij','k' >> > #we may force evaluation of a (sub) TensorExpression by calling it >> > #this is trivial to translate to call to einsum >> > #but such special cases could be dispatched to BLAS as well >> > b = (A(ij) * x(j)) (i) >> > #alternatively, we can predeclare a LHS which is automatically sized >> later >> > #note that this translates into the same call as the above; just some >> > syntactic sugar >> > b = np.empty(()) >> > b[i] = A(ij) * x(j) >> > #more complex TensorExpression graphs of this form are trivial to >> translate >> > to a call to einsum as well >> > a(i)*b(j)*c(k) >> > #conceptually, there is no need to limit this scheme to multiplications >> > only! >> > #although such generalizations would require a more complex execution >> engine >> > #however, the revamped nditer should make this quite managable to >> implement >> > a(i)*b(j) + c(k) >> > #if axes strings are omitted, standard numpy broadcasting rules are >> applied >> > to the expressiongraph created >> > #this is identical to a*b+c; except that we have the opportunity to >> > eliminate temporaries >> > a()*b()+c() >> > >> > >> > Note that such an approach kills quite some birds with one stone >> > it allows for the elimination of temporaries along the lines of numexpr >> > >> > But if i could write: >> > >> > b[i] = A[ij] * x[j] >> > I would much prefer that over >> > b = A @ x >> > even though the latter is shorter >> > >> > Now if i had n input and output vectors, it would be easy what to do >> with >> > them: >> > >> > b[ni] = A[ij] * x[nj] >> > >> > As i argued earlier, I much prefer this form of explicitness over >> > conventions about what constitutes a row or column vector. And >> vectorization >> > of linear algebra is a trivial extension in this manner, which in >> itself is >> > just a subset of even more general multilinear products, which >> themselves >> > are a subset of more general expression involving things other than >> products >> > >> > Its a somewhat ambitious idea, and there are probably reasons why it >> isnt a >> > good idea as well, but it does not require python language >> modifications, >> > and it does not clash with any other functionality or syntax of numpy, >> as >> > far as i can tell. Calling of arrays is not yet defined, and >> alternatively >> > array indexing could be overloaded on string type. >> > >> > Either way, something to chew on when deciding on the best way to go >> > forward. >> > >> > >> > >> > >> > >> > On Tue, Mar 18, 2014 at 4:28 AM, Mark Daoust >> wrote: >> >> >> >> On Mon, Mar 17, 2014 at 8:54 PM, Nathaniel Smith >> wrote: >> >>> >> >>> >> >>> But, this is actually a feature! Because obviously what *should* be >> >>> returned in this case is *not* (Mat @ vec) @ Mat, *or* Mat @ (vec @ >> >>> Mat). Both of those answers are terrible; it's just, if you have an >> >>> ordinary left-/right-associative operator, those are your only >> >>> options. What *should* be returned is an error. And in this scheme we >> >>> get to see the whole @ expression at once, so we actually can raise an >> >>> error for such things. >> >> >> >> >> >> >> >> Sorry if this is a little off topic. >> >> >> >> But there's still something about the "vector" examples that bugs me, >> >> "matrix at vector" and "vector@@2", keep popping up (this also applies >> to the >> >> matrix at matrix examples to a lesser extent). >> >> >> >> I'm a little unconformable looking at the shape to to decide what's a >> >> matrix and what's a vector. (Matlab has some problems like this) >> >> >> >> If it only has one or two dimensions it's easy, but I always find that >> if >> >> I've written code that works for 1 matrix or vector, 5 minutes later I >> want >> >> it to work for fields of matrices or vectors. If we're just going by >> shape >> >> there's no way to distinguish between a 2d field of matrices and a 3d >> field >> >> of vectors. >> >> >> >> I guess this is a repeat of part of what Eelco Hoogendoorn saying a few >> >> posts back >> >> >> >> I was just wondering if anyone sees a place, to get @ a little closer >> to >> >> Einsum, for some sort of array class that understands the difference >> between >> >> a 4D array of scalars, a 3D array of vectors, and a 2D array of >> matrices... >> >> The difference between the axes that broad-cast and the axes that can >> sum >> >> when you hit them with an @ ... or something like that. >> >> >> >> Just a thought. >> >> >> >> Einsum is fantastic by the way, totally worth learning and using. >> >> >> >> >> >> >> >> >> >> Mark Daoust >> >> >> >> _______________________________________________ >> >> NumPy-Discussion mailing list >> >> NumPy-Discussion at scipy.org >> >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> >> > >> > >> > _______________________________________________ >> > NumPy-Discussion mailing list >> > NumPy-Discussion at scipy.org >> > http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Tue Mar 18 11:29:28 2014 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 18 Mar 2014 15:29:28 +0000 Subject: [Numpy-discussion] [help needed] associativity and precedence of '@' In-Reply-To: References: Message-ID: On Tue, Mar 18, 2014 at 3:22 PM, Christophe Bal wrote: > About weak-left. You need to define a priority of @ the matrix product > regarding to * the elementwise product because (A*B)@C <> A*(B at C) : see the > example above. I say that also from a mathematical point of view. What example above? > Using mathematical like notations, Matrix1 * Matrix2 * 3 can be written > because (Matrix1 * Matrix2) * 3 = Matrix1 * (Matrix2 * 3). This seems to argue against what you just said. > That's why I think that the weak-left is the better choice. But this is true as well: 3 * Matrix1 * Matrix2 = (3 * Matrix1) * Matrix2 = 3 * (Matrix1 * Matrix2) Does that expression argue for tight-left? -- Robert Kern From projetmbc at gmail.com Tue Mar 18 11:37:18 2014 From: projetmbc at gmail.com (Christophe Bal) Date: Tue, 18 Mar 2014 16:37:18 +0100 Subject: [Numpy-discussion] [help needed] associativity and precedence of '@' In-Reply-To: References: Message-ID: Strange, Gmail has cut my example. Here it is normally. * [1 2]* *A = [3 4]* * [5 6]* *B = [7 8]* * [a d]* *C = [b c]* *(A*B)@C* *=* *[5 12] [a d]* *[21 32] @ [b c]* *=* *[5a+12b 5d+12c ]* *[21a+32b 21d+32c]* *A*(B at C)* *=* *[1 2] [5a+6b 5d+6c]* *[3 4] * [7a+8b 7d+8c]* *=* *[5a+6b 10d+12c]* *[21a+24b 28d+32c]* 2014-03-18 16:29 GMT+01:00 Robert Kern : > On Tue, Mar 18, 2014 at 3:22 PM, Christophe Bal > wrote: > > About weak-left. You need to define a priority of @ the matrix product > > regarding to * the elementwise product because (A*B)@C <> A*(B at C) : see > the > > example above. I say that also from a mathematical point of view. > > What example above? > > > Using mathematical like notations, Matrix1 * Matrix2 * 3 can be written > > because (Matrix1 * Matrix2) * 3 = Matrix1 * (Matrix2 * 3). > > This seems to argue against what you just said. > > > That's why I think that the weak-left is the better choice. > > But this is true as well: > > 3 * Matrix1 * Matrix2 = (3 * Matrix1) * Matrix2 = 3 * (Matrix1 * Matrix2) > > Does that expression argue for tight-left? > > -- > Robert Kern > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From projetmbc at gmail.com Tue Mar 18 11:40:33 2014 From: projetmbc at gmail.com (Christophe Bal) Date: Tue, 18 Mar 2014 16:40:33 +0100 Subject: [Numpy-discussion] [help needed] associativity and precedence of '@' In-Reply-To: References: Message-ID: When I write "using mathematical like notations...", Matrix1 * Matrix2 is a matrix multiplication. -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Tue Mar 18 12:51:32 2014 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 18 Mar 2014 16:51:32 +0000 Subject: [Numpy-discussion] [help needed] associativity and precedence of '@' In-Reply-To: References: Message-ID: On Tue, Mar 18, 2014 at 9:50 AM, Eelco Hoogendoorn wrote: > To elaborate a little on such a more general and explicit method of > specifying linear operations (perhaps 'expressions with named axes' is a > good nomer to cover this topic). [...] This is a good topic to bring up on numpy-discussion, but maybe you should start a new thread? That way it's both more likely to be noticed by interested parties, and also it will make it easier for me to keep track of what's going on in this thread, which is about a specific concrete decision we need to make ;-). -n -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org From njs at pobox.com Tue Mar 18 12:53:28 2014 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 18 Mar 2014 16:53:28 +0000 Subject: [Numpy-discussion] [help needed] associativity and precedence of '@' In-Reply-To: References: Message-ID: On Tue, Mar 18, 2014 at 3:22 PM, Christophe Bal wrote: > About weak-left. You need to define a priority of @ the matrix product > regarding to * the elementwise product because (A*B)@C <> A*(B at C) This doesn't follow. (a / b) * c != a / (b * c), but / and * in Python have the same priority. -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org From jay.bourque at continuum.io Tue Mar 18 13:26:37 2014 From: jay.bourque at continuum.io (Jay Bourque) Date: Tue, 18 Mar 2014 12:26:37 -0500 Subject: [Numpy-discussion] _gufuncs_linalg module Message-ID: I was just about to submit some pull requests for fixes to the _gufuncs_linalg module and discovered that it no longer exists. It looks like it was removed in this commit. Is there any reason why it was removed without any apparent discussion? It looks like it was originally added in this PRafter a looooong discussion. Thanks, -Jay -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Tue Mar 18 13:31:50 2014 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 18 Mar 2014 13:31:50 -0400 Subject: [Numpy-discussion] [help needed] associativity and precedence of '@' In-Reply-To: References: Message-ID: I'm still bothered by what Nathaniel mentioned about mixing 1d and 2d arrays >>> c = np.arange(4) >>> a = np.arange(16).reshape(4,4) >>> cc = c[:,None] >>> a.dot(c).dot(c.T) 420 >>> a.dot(c.dot(c.T)) array([[ 0, 14, 28, 42], [ 56, 70, 84, 98], [112, 126, 140, 154], [168, 182, 196, 210]]) >>> a.dot(cc).dot(cc.T) array([[ 0, 14, 28, 42], [ 0, 38, 76, 114], [ 0, 62, 124, 186], [ 0, 86, 172, 258]]) >>> a.dot(cc.dot(cc.T)) array([[ 0, 14, 28, 42], [ 0, 38, 76, 114], [ 0, 62, 124, 186], [ 0, 86, 172, 258]]) hint: >>> c.dot(c.T) 14 and I expect it will be a lot more fun if we mix in some 3d or nd arrays. I think some of the decisions should not be driven by what is the most convenient for the usual cases, but by how easy it is to read your code and find the bugs where we made "silly" mistakes. A biased view from someone who learned how to use numpy and scipy by debugging. Matlab and GAUSS are user friendly, they don't allow for reduced dimension and never steal an axis in a reduce operation. I didn't manage to come up with more difficult examples (and ran out of time) >>> (a.dot(c).dot(c.T)).dot(2*a) Traceback (most recent call last): File "", line 1, in AttributeError: 'numpy.int32' object has no attribute 'dot' If we make too many mistakes, then numpy tells us. In the above examples, the scalar dot matrix would raise according to the PEP. I cannot come up with examples where we mix 3d and 2d and 1d because dot currently doesn't do it. "sane-left" Josef -------------- next part -------------- An HTML attachment was scrubbed... URL: From projetmbc at gmail.com Tue Mar 18 13:31:54 2014 From: projetmbc at gmail.com (Christophe Bal) Date: Tue, 18 Mar 2014 18:31:54 +0100 Subject: [Numpy-discussion] [help needed] associativity and precedence of '@' In-Reply-To: References: Message-ID: This is a different situation because / is indeed an hidden multiplication : a/b = a*inv(b). The same is true for + and - : a-b=a+opp(b). What I'm saying is that these operations * and / are indeed of the very same j-kind. This is not the same for * and @. 2014-03-18 17:53 GMT+01:00 Nathaniel Smith : > On Tue, Mar 18, 2014 at 3:22 PM, Christophe Bal > wrote: > > About weak-left. You need to define a priority of @ the matrix product > > regarding to * the elementwise product because (A*B)@C <> A*(B at C) > > This doesn't follow. (a / b) * c != a / (b * c), but / and * in > Python have the same priority. > > -- > Nathaniel J. Smith > Postdoctoral researcher - Informatics - University of Edinburgh > http://vorpus.org > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Tue Mar 18 13:36:32 2014 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 18 Mar 2014 17:36:32 +0000 Subject: [Numpy-discussion] _gufuncs_linalg module In-Reply-To: References: Message-ID: On Tue, Mar 18, 2014 at 5:26 PM, Jay Bourque wrote: > I was just about to submit some pull requests for fixes to the > _gufuncs_linalg module and discovered that it no longer exists. It looks > like it was removed in this commit. Is there any reason why it was removed > without any apparent discussion? It looks like it was originally added in > this PR after a looooong discussion. IIRC the functionality was merged into umath_linalg.c.src? -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org From njs at pobox.com Tue Mar 18 15:18:59 2014 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 18 Mar 2014 19:18:59 +0000 Subject: [Numpy-discussion] [help needed] associativity and precedence of '@' In-Reply-To: References: Message-ID: On 18 Mar 2014 17:32, "Christophe Bal" wrote: > > This is a different situation because / is indeed an hidden multiplication : a/b = a*inv(b). The same is true for + and - : a-b=a+opp(b). What I'm saying is that these operations * and / are indeed of the very same j-kind. > > This is not the same for * and @. // (floordiv) isn't equivalent to a multiplication, but it is also at the same level. << and >> aren't inverses, but they are at the same level. 'in' and 'is' are not even the same type (they have totally different requirements on their right argument) but they are at the same level. Whatever choice we make needs to be something we can justify, and our justification should probably not imply that all of python's other operators are wrong ;-). -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From projetmbc at gmail.com Tue Mar 18 15:25:51 2014 From: projetmbc at gmail.com (Christophe Bal) Date: Tue, 18 Mar 2014 20:25:51 +0100 Subject: [Numpy-discussion] [help needed] associativity and precedence of '@' In-Reply-To: References: Message-ID: I think that there is very big misunderstanding. My point of view is both a mathematical and a programmagical one. Le 18 mars 2014 20:20, "Nathaniel Smith" a ?crit : > On 18 Mar 2014 17:32, "Christophe Bal" wrote: > > > > This is a different situation because / is indeed an hidden > multiplication : a/b = a*inv(b). The same is true for + and - : > a-b=a+opp(b). What I'm saying is that these operations * and / are indeed > of the very same j-kind. > > > > This is not the same for * and @. > > // (floordiv) isn't equivalent to a multiplication, but it is also at the > same level. << and >> aren't inverses, but they are at the same level. 'in' > and 'is' are not even the same type (they have totally different > requirements on their right argument) but they are at the same level. > Whatever choice we make needs to be something we can justify, and our > justification should probably not imply that all of python's other > operators are wrong ;-). > > -n > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From projetmbc at gmail.com Tue Mar 18 16:04:04 2014 From: projetmbc at gmail.com (Christophe Bal) Date: Tue, 18 Mar 2014 21:04:04 +0100 Subject: [Numpy-discussion] [help needed] associativity and precedence of '@' In-Reply-To: References: Message-ID: I'm not saying that Python choices are wrong, I'm just saying that if * is elementwise product, then (A*B)@C <> A*(B at C) should imply to choose different levels fro * and @. By choosing A*B at C = (A*B)@C, that is a convention, we just say that a human who want to calculate something like A*B at C*D at E@G*H*K will have to start first with the elementwise products and then finish by the matrix product that is to say this human will evaluate (A*B)@(C*D)@E@(G*H*K). Maybe you should argue to calculate first the @-products but this will not be a good choice if * is the product of a scalar with a matrix like in 2*A at 3 *B. *On the other hand, if you calculate from left to right, there will be a lot of isolate @-products to do instead of doing a single one. This will not allow to use the very good group technic you have proposed.* 1) If A*B at C*D at E@G*H*K = (A*B)@(C*D)@E@(G*H*K), you quickly evaluate first X = A*B, Y = C*D and Z = G*H*K, and then you can do an efficient @-product of X, Y and Z. 2) If you calculate from left to right, you will do three @-products on couple without having the possibility to choose the more efficient way to evaluate the @-products. Christophe BAL PS1: // is an approximate calculation of the exact mathematical inversion, so it is not really a counter example. PS2: here is a second time my example showing that (A*B)@C <> A*(B at C). * [1 2]* *A = [3 4]* * [5 6]* *B = [7 8]* * [a d]* *C = [b c]* *(A*B)@C* *=* *[5 12] [a d]* *[21 32] @ [b c]* *=* *[5a+12b 5d+12c ]* *[21a+32b 21d+32c]* *A*(B at C)* *=* *[1 2] [5a+6b 5d+6c]* *[3 4] * [7a+8b 7d+8c]* *=* *[5a+6b 10d+12c]* *[21a+24b 28d+32c]* -------------- next part -------------- An HTML attachment was scrubbed... URL: From jay.bourque at continuum.io Tue Mar 18 16:06:08 2014 From: jay.bourque at continuum.io (Jay Bourque) Date: Tue, 18 Mar 2014 15:06:08 -0500 Subject: [Numpy-discussion] _gufuncs_linalg module In-Reply-To: References: Message-ID: Okay, it looks like the removal was part of this PR, and that PR is referenced from this issuewhich mentions needing more review and tests, and also lists several todo items and open issues. I think that more or less answers my question. -Jay On Tue, Mar 18, 2014 at 12:36 PM, Nathaniel Smith wrote: > On Tue, Mar 18, 2014 at 5:26 PM, Jay Bourque > wrote: > > I was just about to submit some pull requests for fixes to the > > _gufuncs_linalg module and discovered that it no longer exists. It looks > > like it was removed in this commit. Is there any reason why it was > removed > > without any apparent discussion? It looks like it was originally added in > > this PR after a looooong discussion. > > IIRC the functionality was merged into umath_linalg.c.src? > > -- > Nathaniel J. Smith > Postdoctoral researcher - Informatics - University of Edinburgh > http://vorpus.org > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From smudkavi at uwaterloo.ca Tue Mar 18 17:49:41 2014 From: smudkavi at uwaterloo.ca (Sankarshan Mudkavi) Date: Tue, 18 Mar 2014 17:49:41 -0400 Subject: [Numpy-discussion] Dates and times and Datetime64 (again) Message-ID: Hey all, It's been a while since the last datetime and timezones discussion thread was visited (linked below): http://thread.gmane.org/gmane.comp.python.numeric.general/53805 It looks like the best approach to follow is the UTC only approach in the linked thread with an optional flag to indicate the timezone (to avoid confusing applications where they don't expect any timezone info). Since this is slightly more useful than having just a naive datetime64 package and would be open to extension if required, it's probably the best way to start improving the datetime64 library. If we do wish to have full timezone support it would very likely lead to performance drops (as reasoned in the thread) and we would need to have a dedicated, maintained tzinfo package, at which point it would make much more sense to just incorporate the pytz library. (I also don't have the expertise to implement this, so I would be unable to help resolve the current logjam) I would like to start writing a NEP for this followed by implementation, however I'm not sure what the format etc. is, could someone direct me to a page where this information is provided? Please let me know if there are any ideas, comments etc. Cheers, Sankarshan -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 496 bytes Desc: Message signed with OpenPGP using GPGMail URL: From ondrej.certik at gmail.com Tue Mar 18 18:05:16 2014 From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=) Date: Tue, 18 Mar 2014 16:05:16 -0600 Subject: [Numpy-discussion] [RFC] should we argue for a matrix power operator, @@? In-Reply-To: References: Message-ID: On Mon, Mar 17, 2014 at 11:30 AM, Fernando Perez wrote: > On Mon, Mar 17, 2014 at 10:01 AM, Aron Ahmadia wrote: >> >> >> On Mon, Mar 17, 2014 at 7:53 AM, Nathaniel Smith wrote: >>> >>> The thread so far, it sounds like the consensus answer is "meh, >>> whatever". So I'm thinking we should just drop @@ from the PEP, and if >>> it turns out that this is a problem we can always revisit it in the >>> ~3.6/3.7 timeframe. >> >> >> +1 from here. > > > +1 too. Absent *clear* enthusiasm and support for new syntax/operators, I > think being conservative and slow is the right approach. Just having @ will > give us data and experience with this space, and it may become clear after > one more cycle that we really need/want @@, or not, as the case may be. But > it's easier to add it later if we really need it than to remove it if it > proves to be a bad idea, so +1 for moving slowly on this. +1. Thanks Nathan for pushing this! Ondrej From jaime.frio at gmail.com Tue Mar 18 18:21:22 2014 From: jaime.frio at gmail.com (=?ISO-8859-1?Q?Jaime_Fern=E1ndez_del_R=EDo?=) Date: Tue, 18 Mar 2014 15:21:22 -0700 Subject: [Numpy-discussion] PR with changes to triangular array functions Message-ID: I submitted a PR that makes some improvements to the numpy functions dealing with triangular arrays. Aside from a general speed-up of about 2x for most functions, there are some minor changes to the public API. In case anyone is concerned about them, here's a list: * 'np.tri' now accepts a boolean 'invert' kwarg that is equivalent to '1 - np.tri' only faster. * 'np.mask_indices' is no longer used by any of the triangular array functions. While it is part of the public API, it is not even mentioned in the documentation AFAICT. It may be a candidate for deprecation IMO. * 'np.tril_indices' and 'np.triu_indices' now accept an 'm' kwarg to indicate the number of columns of the array, so they are no longer restricted to square arrays. The weird thing is that, to preserve the order of the existing arguments, the signature is '(n, k=0, m=None)', while other similar functions, such as 'np.tri', have signature '(n, m=None, k=0)'. * 'np.triu_indices_from' and 'np.tril_indices_from' now also accept rectangular arrays. The PR can be found here: https://github.com/numpy/numpy/pull/4509 Jaime -- (\__/) ( O.o) ( > <) Este es Conejo. Copia a Conejo en tu firma y ay?dale en sus planes de dominaci?n mundial. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Tue Mar 18 18:36:40 2014 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Tue, 18 Mar 2014 23:36:40 +0100 Subject: [Numpy-discussion] PR with changes to triangular array functions In-Reply-To: References: Message-ID: On Tue, Mar 18, 2014 at 11:21 PM, Jaime Fern?ndez del R?o < jaime.frio at gmail.com> wrote: > I submitted a PR that makes some improvements to the numpy functions > dealing with triangular arrays. Aside from a general speed-up of about 2x > for most functions, there are some minor changes to the public API. In case > anyone is concerned about them, here's a list: > Hi Jaime, I have no concerns but do want to say thank you for the excellent summaries of your PRs that you send to the lists. Great to keep everyone who doesn't follow Github activity informed, we should all be doing this more often! Cheers, Ralf > > * 'np.tri' now accepts a boolean 'invert' kwarg that is equivalent to '1 - > np.tri' only faster. > * 'np.mask_indices' is no longer used by any of the triangular array > functions. While it is part of the public API, it is not even mentioned in > the documentation AFAICT. It may be a candidate for deprecation IMO. > * 'np.tril_indices' and 'np.triu_indices' now accept an 'm' kwarg to > indicate the number of columns of the array, so they are no longer > restricted to square arrays. The weird thing is that, to preserve the order > of the existing arguments, the signature is '(n, k=0, m=None)', while other > similar functions, such as 'np.tri', have signature '(n, m=None, k=0)'. > * 'np.triu_indices_from' and 'np.tril_indices_from' now also accept > rectangular arrays. > > The PR can be found here: > https://github.com/numpy/numpy/pull/4509 > > Jaime > -- > (\__/) > ( O.o) > ( > <) Este es Conejo. Copia a Conejo en tu firma y ay?dale en sus planes > de dominaci?n mundial. > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Tue Mar 18 19:17:46 2014 From: chris.barker at noaa.gov (Chris Barker) Date: Tue, 18 Mar 2014 16:17:46 -0700 Subject: [Numpy-discussion] Dates and times and Datetime64 (again) In-Reply-To: References: Message-ID: On Tue, Mar 18, 2014 at 2:49 PM, Sankarshan Mudkavi wrote: > It's been a while since the last datetime and timezones discussion thread > was visited (linked below): > > http://thread.gmane.org/gmane.comp.python.numeric.general/53805 > > It looks like the best approach to follow is the UTC only approach in the > linked thread with an optional flag to indicate the timezone (to avoid > confusing applications where they don't expect any timezone info). Since > this is slightly more useful than having just a naive datetime64 package > and would be open to extension if required, it's probably the best way to > start improving the datetime64 library. > IIUC, I agree -- which is why we need a NEP to specify the details. Thank you for stepping up! If we do wish to have full timezone support it would very likely lead to > performance drops (as reasoned in the thread) and we would need to have a > dedicated, maintained tzinfo package, at which point it would make much > more sense to just incorporate the pytz library. > yup -- there is the option of doing what the stdlib datetime does -- provide a hook to incorporate timezone,s but don't provide an implementation, unless that is a low-level hook that must be implemented in C, it's going to be slow -- slow enough that you might as well use a list of stdlib datetimes.... Also, this has gone far to long without getting fixed -- we need something simple to implement more than anything else. > I would like to start writing a NEP for this followed by implementation, > however I'm not sure what the format etc. is, could someone direct me to a > page where this information is provided? > I don't know that there is such a thing, but you'll find the existing NEPS here: https://github.com/numpy/numpy/tree/master/doc/neps I'd grab one and follow the format. > Please let me know if there are any ideas, comments etc. > Thanks again -- I look forward to seeing it written up, -- I'm sure to have something to say then! -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From novin01 at gmail.com Wed Mar 19 05:21:17 2014 From: novin01 at gmail.com (Dave Hirschfeld) Date: Wed, 19 Mar 2014 09:21:17 +0000 (UTC) Subject: [Numpy-discussion] Dates and times and Datetime64 (again) References: Message-ID: Sankarshan Mudkavi uwaterloo.ca> writes: > > Hey all, > It's been a while since the last datetime and timezones discussion thread was visited (linked below): > > http://thread.gmane.org/gmane.comp.python.numeric.general/53805 > > It looks like the best approach to follow is the UTC only approach in the linked thread with an optional flag to indicate the timezone (to avoid confusing applications where they don't expect any timezone info). Since this is slightly more useful than having just a naive datetime64 package and would be open to extension if required, it's probably the best way to start improving the datetime64 library. > > I would like to start writing a NEP for this followed by implementation, however I'm not sure what the format etc. is, could someone direct me to a page where this information is provided? > > Please let me know if there are any ideas, comments etc. > > Cheers, > Sankarshan > See: http://article.gmane.org/gmane.comp.python.numeric.general/55191 You could use a current NEP as a template: https://github.com/numpy/numpy/tree/master/doc/neps I'm a huge +100 on the simplest UTC fix. As is, using numpy datetimes is likely to silently give incorrect results - something I've already seen several times in end-user data analysis code. Concrete Example: In [16]: dates = pd.date_range('01-Apr-2014', '04-Apr-2014', freq='H')[:-1] ...: values = np.array([1,2,3]).repeat(24) ...: records = zip(map(str, dates), values) ...: pd.TimeSeries(values, dates).groupby(lambda d: d.date()).mean() ...: Out[16]: 2014-04-01 1 2014-04-02 2 2014-04-03 3 dtype: int32 In [17]: df = pd.DataFrame(np.array(records, dtype=[('dates', 'M8[h]'), ('values', float)])) ...: df.set_index('dates', inplace=True) ...: df.groupby(lambda d: d.date()).mean() ...: Out[17]: values 2014-03-31 1.000000 2014-04-01 1.041667 2014-04-02 2.041667 2014-04-03 3.000000 [4 rows x 1 columns] Try it in your timezone and see what you get! -Dave From jeffreback at gmail.com Wed Mar 19 08:25:39 2014 From: jeffreback at gmail.com (Jeff Reback) Date: Wed, 19 Mar 2014 08:25:39 -0400 Subject: [Numpy-discussion] Dates and times and Datetime64 (again) In-Reply-To: References: Message-ID: Dave, your example is not a problem with numpy per se, rather that the default generation is in local timezone (same as what python datetime does). If you localize to UTC you get the results that you expect. In [49]: dates = pd.date_range('01-Apr-2014', '04-Apr-2014', freq='H')[:-1] In [50]: pd.TimeSeries(values, dates.tz_localize('UTC')).groupby(lambda d: d.date()).mean() Out[50]: 2014-04-01 1 2014-04-02 2 2014-04-03 3 dtype: int64 In [51]: records = zip(map(str, dates.tz_localize('UTC')), values) In [52]: df = pd.DataFrame(np.array(records, dtype=[('dates', 'M8[h]'),('values', float)])) In [53]: df.set_index('dates').groupby(lambda x: x.date()).mean() Out[53]: values 2014-04-01 1 2014-04-02 2 2014-04-03 3 [3 rows x 1 columns] On Wed, Mar 19, 2014 at 5:21 AM, Dave Hirschfeld wrote: > Sankarshan Mudkavi uwaterloo.ca> writes: > > > > > Hey all, > > It's been a while since the last datetime and timezones discussion thread > was visited (linked below): > > > > http://thread.gmane.org/gmane.comp.python.numeric.general/53805 > > > > It looks like the best approach to follow is the UTC only approach in the > linked thread with an optional flag to indicate the timezone (to avoid > confusing applications where they don't expect any timezone info). Since > this is slightly more useful than having just a naive datetime64 package > and > would be open to extension if required, it's probably the best way to start > improving the datetime64 library. > > > > > I would like to start writing a NEP for this followed by implementation, > however I'm not sure what the format etc. is, could someone direct me to a > page where this information is provided? > > > > Please let me know if there are any ideas, comments etc. > > > > Cheers, > > Sankarshan > > > > See: http://article.gmane.org/gmane.comp.python.numeric.general/55191 > > > You could use a current NEP as a template: > https://github.com/numpy/numpy/tree/master/doc/neps > > > I'm a huge +100 on the simplest UTC fix. > > As is, using numpy datetimes is likely to silently give incorrect results - > something I've already seen several times in end-user data analysis code. > > Concrete Example: > > In [16]: dates = pd.date_range('01-Apr-2014', '04-Apr-2014', freq='H')[:-1] > ...: values = np.array([1,2,3]).repeat(24) > ...: records = zip(map(str, dates), values) > ...: pd.TimeSeries(values, dates).groupby(lambda d: d.date()).mean() > ...: > Out[16]: > 2014-04-01 1 > 2014-04-02 2 > 2014-04-03 3 > dtype: int32 > > In [17]: df = pd.DataFrame(np.array(records, dtype=[('dates', 'M8[h]'), > ('values', float)])) > ...: df.set_index('dates', inplace=True) > ...: df.groupby(lambda d: d.date()).mean() > ...: > Out[17]: > values > 2014-03-31 1.000000 > 2014-04-01 1.041667 > 2014-04-02 2.041667 > 2014-04-03 3.000000 > > [4 rows x 1 columns] > > Try it in your timezone and see what you get! > > -Dave > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From a.h.jaffe at gmail.com Wed Mar 19 09:56:36 2014 From: a.h.jaffe at gmail.com (Andrew Jaffe) Date: Wed, 19 Mar 2014 13:56:36 +0000 Subject: [Numpy-discussion] [RFC] should we argue for a matrix power operator, @@? In-Reply-To: References: Message-ID: On 16/03/2014 01:31, josef.pktd at gmail.com wrote: > > > > On Sat, Mar 15, 2014 at 8:47 PM, Warren Weckesser > > wrote: > > > On Sat, Mar 15, 2014 at 8:38 PM, > wrote: > > I think I wouldn't use anything like @@ often enough to remember > it's meaning. I'd rather see english names for anything that is > not **very** common. > > I find A@@-1 pretty ugly compared to inv(A) > A@@(-0.5) might be nice (do we have matrix_sqrt ?) > > > > scipy.linalg.sqrtm: > http://docs.scipy.org/doc/scipy/reference/generated/scipy.linalg.sqrtm.html > > > maybe a good example: I could never figured that one out > > M = sqrtm(A) > > A = M @ M > > but what we use in stats is > > A = R.T @ R > (eigenvectors dot diag(sqrt of eigenvalues) > > which sqrt is A@@(0.5) ? > > Josef Agreed- In general, "the matrix square root" isn't a well-defined quantity. For some uses, the Cholesky decomposition is what you want, for some others it's the matrix with the same eigenvectors, but the square root of the eigenvalues, etc. etc. As an important aside, it would be good if the docs addressed this. Yours, Andrew From novin01 at gmail.com Wed Mar 19 10:01:08 2014 From: novin01 at gmail.com (Dave Hirschfeld) Date: Wed, 19 Mar 2014 14:01:08 +0000 (UTC) Subject: [Numpy-discussion] Dates and times and Datetime64 (again) References: Message-ID: Jeff Reback gmail.com> writes: > > Dave, > > your example is not a problem with numpy per se, rather that the default generation is in local timezone (same as what python datetime does). > If you localize to UTC you get the results that you expect.? > The problem is that the default datetime generation in *numpy* is in local time. Note that this *is not* the case in Python - it doesn't try to guess the timezone info based on where in the world you run the code, if it's not provided it sets it to None. In [7]: pd.datetime? Type: type String Form: Docstring: datetime(year, month, day[, hour[, minute[, second[, microsecond[,tzinfo]]]]]) The year, month and day arguments are required. tzinfo may be None, or an instance of a tzinfo subclass. The remaining arguments may be ints or longs. In [8]: pd.datetime(2000,1,1).tzinfo is None Out[8]: True This may be the best solution but as others have pointed out this is more difficult to implement and may have other issues. I don't want to wait for the best solution - the assume UTC on input/output if not specified will solve the problem and this desperately needs to be fixed because it's completely broken as is IMHO. > If you localize to UTC you get the results that you expect. That's the whole point - *numpy* needs to localize to UTC, not to whatever timezone you happen to be in when running the code. In a real-world data analysis problem you don't start with the data in a DataFrame or a numpy array it comes from the web, a csv, Excel, a database and you want to convert it to a DataFrame or numpy array. So what you have from whatever source is a list of tuples of strings and you want to convert them into a typed array. Obviously you can't localize a string - you have to convert it to a date first and if you do that with numpy the date you have is wrong. In [108]: dst = np.array(['2014-03-30 00:00', '2014-03-30 01:00', '2014-03- 30 02:00'], dtype='M8[h]') ...: dst ...: Out[108]: array(['2014-03-30T00+0000', '2014-03-30T00+0000', '2014-03- 30T02+0100'], dtype='datetime64[h]') In [109]: dst.tolist() Out[109]: [datetime.datetime(2014, 3, 30, 0, 0), datetime.datetime(2014, 3, 30, 0, 0), datetime.datetime(2014, 3, 30, 1, 0)] AFAICS there's no way to get the original dates back once they've passed through numpy's parser!? -Dave From njs at pobox.com Wed Mar 19 14:24:18 2014 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 19 Mar 2014 18:24:18 +0000 Subject: [Numpy-discussion] [help needed] associativity and precedence of '@' In-Reply-To: References: Message-ID: On Tue, Mar 18, 2014 at 9:14 AM, Robert Kern wrote: > On Tue, Mar 18, 2014 at 12:54 AM, Nathaniel Smith wrote: >> On Sat, Mar 15, 2014 at 6:28 PM, Nathaniel Smith wrote: >>> Mathematica: instead of having an associativity, a @ b @ c gets >>> converted into mdot([a, b, c]) >> >> So, I've been thinking about this (thanks to @rfateman for pointing it >> out), and wondering if Mathematica's approach is worth following up >> more. (It would need to make it past python-dev, of course, but worst >> case is just that they say no and we're back where we are now, so we >> might as well think it through.) > > I predict with near-certainty that this will be rejected, I guess that's what everyone thought about @ too? ;-) > but that > doesn't prevent it from derailing the discussion. This proposal is > unlike anything else in Python. Chained comparisons are *not* similar > to this proposal. The chaining only happens at the syntax level, not > the semantics. `a < b < c` gets compiled down to `a.__lt__(b) and > b.__lt__(c)`, not `do_comparison([a, b, c], [lt, lt])`. Yes, the syntax is the same as chained comparisons, and the dispatch is a generalization of regular operators. It is unusual; OTOH, @ is unusual in that no other operators in Python have the property that evaluating in the wrong order can cost you seconds of time and gigabytes of memory. Perhaps. > We have approval for a binary @ operator. Take the win. We have approval, and we have a request: that we figure out how @ should work in detail to be most useful to us. Maybe that's this proposal; maybe not. Ultimately rejected-or-not-rejected comes down to how strong the arguments for something are. And while we can make some guesses about that, it's impossible to know how strong an argument will be until one sits down and works it out. So I still would like to hear what people think, even if it just ends in the conclusion that it's a terrible idea ;-). As for arguments against the "grouping" semantics, I did think of one another case where @ is not associative, though it's pretty weird: In [9]: a = np.arange(16, dtype=np.int8).reshape((4, 4)) In [10]: np.dot(a, np.dot(a, a.astype(float))) Out[10]: array([[ 1680., 1940., 2200., 2460.], [ 4880., 5620., 6360., 7100.], [ 8080., 9300., 10520., 11740.], [ 11280., 12980., 14680., 16380.]]) In [12]: np.dot(np.dot(a, a), a.astype(float)) Out[12]: array([[ 1680., 1940., 2200., 2460.], [-1264., -1548., -1832., -2116.], [ 1936., 2132., 2328., 2524.], [-1008., -1100., -1192., -1284.]]) (What's happening is that we have int8 @ int8 @ float, so (int8 @ int8) @ float has overflows in the first computation, but int8 @ (int8 @ float) does all the computations in float, with no overflows.) -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org From josef.pktd at gmail.com Wed Mar 19 15:45:03 2014 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 19 Mar 2014 15:45:03 -0400 Subject: [Numpy-discussion] [help needed] associativity and precedence of '@' In-Reply-To: References: Message-ID: On Wed, Mar 19, 2014 at 2:24 PM, Nathaniel Smith wrote: > On Tue, Mar 18, 2014 at 9:14 AM, Robert Kern > wrote: > > On Tue, Mar 18, 2014 at 12:54 AM, Nathaniel Smith wrote: > >> On Sat, Mar 15, 2014 at 6:28 PM, Nathaniel Smith wrote: > >>> Mathematica: instead of having an associativity, a @ b @ c gets > >>> converted into mdot([a, b, c]) > >> > >> So, I've been thinking about this (thanks to @rfateman for pointing it > >> out), and wondering if Mathematica's approach is worth following up > >> more. (It would need to make it past python-dev, of course, but worst > >> case is just that they say no and we're back where we are now, so we > >> might as well think it through.) > > > > I predict with near-certainty that this will be rejected, > > I guess that's what everyone thought about @ too? ;-) > > > but that > > doesn't prevent it from derailing the discussion. This proposal is > > unlike anything else in Python. Chained comparisons are *not* similar > > to this proposal. The chaining only happens at the syntax level, not > > the semantics. `a < b < c` gets compiled down to `a.__lt__(b) and > > b.__lt__(c)`, not `do_comparison([a, b, c], [lt, lt])`. > > Yes, the syntax is the same as chained comparisons, and the dispatch > is a generalization of regular operators. It is unusual; OTOH, @ is > unusual in that no other operators in Python have the property that > evaluating in the wrong order can cost you seconds of time and > gigabytes of memory. Perhaps. > > > We have approval for a binary @ operator. Take the win. > > We have approval, and we have a request: that we figure out how @ > should work in detail to be most useful to us. Maybe that's this > proposal; maybe not. Ultimately rejected-or-not-rejected comes down to > how strong the arguments for something are. And while we can make some > guesses about that, it's impossible to know how strong an argument > will be until one sits down and works it out. So I still would like to > hear what people think, even if it just ends in the conclusion that > it's a terrible idea ;-). > What happens if you have 5 @ in a row? My head hurts if I had to think about what would actually be going on. and don't forget, the sparse matrix is stuck in the middle. But I would be happy to have a optimizing multi_dot or chain_dot function when it feels safe enough. > > As for arguments against the "grouping" semantics, I did think of one > another case where @ is not associative, though it's pretty weird: > > In [9]: a = np.arange(16, dtype=np.int8).reshape((4, 4)) > > In [10]: np.dot(a, np.dot(a, a.astype(float))) > Out[10]: > array([[ 1680., 1940., 2200., 2460.], > [ 4880., 5620., 6360., 7100.], > [ 8080., 9300., 10520., 11740.], > [ 11280., 12980., 14680., 16380.]]) > > In [12]: np.dot(np.dot(a, a), a.astype(float)) > Out[12]: > array([[ 1680., 1940., 2200., 2460.], > [-1264., -1548., -1832., -2116.], > [ 1936., 2132., 2328., 2524.], > [-1008., -1100., -1192., -1284.]]) > > (What's happening is that we have int8 @ int8 @ float, so (int8 @ > int8) @ float has overflows in the first computation, but int8 @ (int8 > @ float) does all the computations in float, with no overflows.) > That's similar to my example before that mixes in some scalar *. I thought of it as an argument for same-left Josef > > -- > Nathaniel J. Smith > Postdoctoral researcher - Informatics - University of Edinburgh > http://vorpus.org > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Wed Mar 19 15:45:30 2014 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 19 Mar 2014 19:45:30 +0000 Subject: [Numpy-discussion] [help needed] associativity and precedence of '@' In-Reply-To: References: Message-ID: On Sat, Mar 15, 2014 at 3:41 AM, Nathaniel Smith wrote: > I think we need to > know something about how often the Mat @ Mat @ vec type cases arise in > practice. How often do non-scalar * and np.dot show up in the same > expression? How often does it look like a * np.dot(b, c), and how often does > it look like np.dot(a * b, c)? How often do we see expressions like > np.dot(np.dot(a, b), c), and how often do we see expressions like np.dot(a, > np.dot(b, c))? This would really help guide the debate. I don't have this > data, and I'm not sure the best way to get it. A super-fancy approach would > be to write a little script that uses the 'ast' module to count things > automatically. A less fancy approach would be to just pick some code you've > written, or a well-known package, grep through for calls to 'dot', and make > notes on what you see. (An advantage of the less-fancy approach is that as a > human you might be able to tell the difference between scalar and non-scalar > *, or check whether it actually matters what order the 'dot' calls are done > in.) Okay, I wrote a little script [1] to scan Python source files look for things like 'dot(a, dot(b, c))' or 'dot(dot(a, b), c)', or the ndarray.dot method equivalents. So what we get out is: - a count of how many 'dot' calls there are - a count of how often we see left-associative nestings: dot(dot(a, b), c) - a count of how often we see right-associative nestings: dot(a, dot(b, c)) Running it on a bunch of projects, I get: | project | dots | left | right | right/left | |--------------+------+------+-------+------------| | scipy | 796 | 53 | 27 | 0.51 | | nipy | 275 | 3 | 19 | 6.33 | | scikit-learn | 472 | 11 | 10 | 0.91 | | statsmodels | 803 | 46 | 38 | 0.83 | | astropy | 17 | 0 | 0 | nan | | scikit-image | 15 | 1 | 0 | 0.00 | |--------------+------+------+-------+------------| | total | 2378 | 114 | 94 | 0.82 | (Any other projects worth trying? This is something that could vary a lot between different projects, so it seems more important to get lots of projects here than to get a few giant projects. Or if anyone wants to run the script on their own private code, please do! Running it on my personal pile of random junk finds 3 left-associative and 1 right.) Two flaws with this approach: 1) Probably some proportion of those nested dot calls are places where it doesn't actually matter which evaluation order one uses -- dot() forces you to pick one, so you have to. If people prefer to, say, use the "left" form in cases where it doesn't matter, then this could bias the left-vs-right results -- hard to say. (Somewhere in this thread it was suggested that the use of the .dot method could create such a bias, because a.dot(b).dot(c) is more natural than a.dot(b.dot(c)), but only something like 6% of the dot calls here use the method form, so this probably doesn't matter.) OTOH, this also means that the total frequency of @ expressions where associativity even matters at all is probably *over*-estimated by the above. 2) This approach misses cases where the cumbersomeness of dot has caused people to introduce temporary variables, like 'foo = np.dot(a, b); bar = np.dot(foo, c)'. So this causes us to *under*-estimate how often associativity matters. I did read through the 'dot' uses in scikit-learn and nipy, though, and only caught a handful of such cases, so I doubt it changes anything much. -n [1] https://gist.github.com/njsmith/9157645#file-grep-dot-dot-py -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From smudkavi at uwaterloo.ca Wed Mar 19 22:07:13 2014 From: smudkavi at uwaterloo.ca (Sankarshan Mudkavi) Date: Wed, 19 Mar 2014 22:07:13 -0400 Subject: [Numpy-discussion] Dates and times and Datetime64 (again) In-Reply-To: References: Message-ID: <79374FB2-205D-4B76-ADB2-F9895D3A2DF4@uwaterloo.ca> On Mar 19, 2014, at 10:01 AM, Dave Hirschfeld wrote: > Jeff Reback gmail.com> writes: > >> >> Dave, >> >> your example is not a problem with numpy per se, rather that the default > generation is in local timezone (same as what python datetime does). >> If you localize to UTC you get the results that you expect. >> > > The problem is that the default datetime generation in *numpy* is in local > time. > > Note that this *is not* the case in Python - it doesn't try to guess the > timezone info based on where in the world you run the code, if it's not > provided it sets it to None. > > In [7]: pd.datetime? > Type: type > String Form: > Docstring: > datetime(year, month, day[, hour[, minute[, second[, > microsecond[,tzinfo]]]]]) > > The year, month and day arguments are required. tzinfo may be None, or an > instance of a tzinfo subclass. The remaining arguments may be ints or longs. > > In [8]: pd.datetime(2000,1,1).tzinfo is None > Out[8]: True > > > This may be the best solution but as others have pointed out this is more > difficult to implement and may have other issues. > > I don't want to wait for the best solution - the assume UTC on input/output > if not specified will solve the problem and this desperately needs to be > fixed because it's completely broken as is IMHO. > > >> If you localize to UTC you get the results that you expect. > > That's the whole point - *numpy* needs to localize to UTC, not to whatever > timezone you happen to be in when running the code. > > In a real-world data analysis problem you don't start with the data in a > DataFrame or a numpy array it comes from the web, a csv, Excel, a database > and you want to convert it to a DataFrame or numpy array. So what you have > from whatever source is a list of tuples of strings and you want to convert > them into a typed array. > > Obviously you can't localize a string - you have to convert it to a date > first and if you do that with numpy the date you have is wrong. > > In [108]: dst = np.array(['2014-03-30 00:00', '2014-03-30 01:00', '2014-03- > 30 02:00'], dtype='M8[h]') > ...: dst > ...: > Out[108]: array(['2014-03-30T00+0000', '2014-03-30T00+0000', '2014-03- > 30T02+0100'], dtype='datetime64[h]') > > In [109]: dst.tolist() > Out[109]: > [datetime.datetime(2014, 3, 30, 0, 0), > datetime.datetime(2014, 3, 30, 0, 0), > datetime.datetime(2014, 3, 30, 1, 0)] > > > AFAICS there's no way to get the original dates back once they've passed > through numpy's parser!? > > > -Dave > > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion Hi all, I've written a rather rudimentary NEP, (lacking in technical details which I will hopefully add after some further discussion and receiving clarification/help on this thread). Please let me know how to proceed and what you think should be added to the current proposal (attached to this mail). Here is a rendered version of the same: https://github.com/Sankarshan-Mudkavi/numpy/blob/Enhance-datetime64/doc/neps/datetime-improvement-proposal.rst Cheers, Sankarshan -- Sankarshan Mudkavi Undergraduate in Physics, University of Waterloo www.smudkavi.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From dalke at dalkescientific.com Thu Mar 20 00:01:28 2014 From: dalke at dalkescientific.com (Andrew Dalke) Date: Thu, 20 Mar 2014 05:01:28 +0100 Subject: [Numpy-discussion] [help needed] associativity and precedence of '@' In-Reply-To: References: Message-ID: <14291F69-C4B0-41CC-AEA5-9ABC55E5198B@dalkescientific.com> On Mar 15, 2014, at 4:41 AM, Nathaniel Smith wrote: > OPTION 1 FOR @: ... "same-left" > OPTION 2 FOR @: ... "weak-right" > OPTION 3 FOR @: ... "tight-right" (In addition to more unusual forms, like 'grouping'.) There's another option, which is to "refuse the temptation to guess", and not allow X @ Y @ Z or mixing with any other operators. After all, several have pointed out that it should be in parenthesis anyway, in order to avoid likely confusion. There's even a bit of precedent for something like this in Python: >>> f(1, 2 for i in range(10)) File "", line 1 SyntaxError: Generator expression must be parenthesized if not sole argument I haven't seen this non-associative option come up in the discussion. To be frank though, I don't think this is a good idea, but Nathaniel wrote "In principle the other 2 possible options are ...", so I wanted to mention this for completion. My preference is for same-left. I rarely work with numpy, and it's more likely that I'll see '@' used in a non-numpy context. That is, people in general will see "@" as a sort of free-for-all operator, to use and abuse as they wish. [1] (For example, Pyparsing has a lot of operator overloads to help make a grammar definition, and they make good sense in that context, but '<<' for recursive definitions is perhaps past the edge.) Someone looking at a "@", without any intuition on precedence or associativity of matrix operations in a mathematical package, will have to figure things out from the documentation or (more likely) experimentation. If and when that happens, then > "Same-left" is the easiest to explain and remember, because it's just, "@ acts like * and /". Cheers, Andrew dalke at dalkescientific.com I came up with two possible ways people might (ab)use it: 1) since "@" is server-like, then a service resolver: service = XMLRPCServer @ "http://localhost:1234/endpoint" There's no real need for this, since there are other equally good ways to structure this sort of call. But someone creative might come up with a good example of using '@' to mean some sort of routing. Interestingly, that creative person might prefer right-associative, to support the 'natural' ("janet" @ "moria" @ "uunet" @ uucpserver).send(message) rather than the inverted: (uucpserver @ "uunet" @ "moria" @ "janet").send(message) This would likely fall under the definition of "cute and ignorable". 2) "@" in XPath indicates an attribute. An XML tree API might support something like: tree = load_xml_tree(...) for node in tree.select("//item[@price > 2*@discount]"): print node @ price, node @ discount That might even be a reasonable short-hand, compared to, say, etree's node.attrib["price"] XML doesn't allow nodes as attributes, so that provides no guidance as to what node @ price @ 1950 might mean. From robert.kern at gmail.com Thu Mar 20 05:07:59 2014 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 20 Mar 2014 09:07:59 +0000 Subject: [Numpy-discussion] [help needed] associativity and precedence of '@' In-Reply-To: <14291F69-C4B0-41CC-AEA5-9ABC55E5198B@dalkescientific.com> References: <14291F69-C4B0-41CC-AEA5-9ABC55E5198B@dalkescientific.com> Message-ID: On Thu, Mar 20, 2014 at 4:01 AM, Andrew Dalke wrote: > My preference is for same-left. I rarely work with numpy, and it's more > likely that I'll see '@' used in a non-numpy context. That is, people > in general will see "@" as a sort of free-for-all operator, to use and abuse > as they wish. [1] > > (For example, Pyparsing has a lot of operator overloads to help make > a grammar definition, and they make good sense in that context, but '<<' > for recursive definitions is perhaps past the edge.) > > Someone looking at a "@", without any intuition on precedence or > associativity of matrix operations in a mathematical package, will > have to figure things out from the documentation or (more likely) > experimentation. I think the one thing this discussion has settled is that there is *no* common intuition about the precedence or associativity of matrix operations in a mathematical package. :-) I think the operator-overload-as-DSL use cases actually argue somewhat for right-associativity. There is no lack of left-associative operators for these use cases to choose from since they usually don't have numeric or bitwise operations defined for them. Right-associativity adds some diversity into the ecosystem and opens up some design space. -- Robert Kern From njs at pobox.com Thu Mar 20 07:16:41 2014 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 20 Mar 2014 11:16:41 +0000 Subject: [Numpy-discussion] Dates and times and Datetime64 (again) In-Reply-To: <79374FB2-205D-4B76-ADB2-F9895D3A2DF4@uwaterloo.ca> References: <79374FB2-205D-4B76-ADB2-F9895D3A2DF4@uwaterloo.ca> Message-ID: On 20 Mar 2014 02:07, "Sankarshan Mudkavi" wrote: > I've written a rather rudimentary NEP, (lacking in technical details which I will hopefully add after some further discussion and receiving clarification/help on this thread). > > Please let me know how to proceed and what you think should be added to the current proposal (attached to this mail). > > Here is a rendered version of the same: > https://github.com/Sankarshan-Mudkavi/numpy/blob/Enhance-datetime64/doc/neps/datetime-improvement-proposal.rst Your NEP suggests making all datetime64s be in UTC, and treating string representations from unknown timezones as UTC. How does this differ from, and why is it superior to, making all datetime64s be naive? -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From dalke at dalkescientific.com Thu Mar 20 09:10:31 2014 From: dalke at dalkescientific.com (Andrew Dalke) Date: Thu, 20 Mar 2014 14:10:31 +0100 Subject: [Numpy-discussion] [help needed] associativity and precedence of '@' In-Reply-To: References: <14291F69-C4B0-41CC-AEA5-9ABC55E5198B@dalkescientific.com> Message-ID: <772C85C4-83A1-441A-BCC6-A3DCAD96D9C4@dalkescientific.com> On Mar 20, 2014, at 10:07 AM, Robert Kern wrote: > I think the operator-overload-as-DSL use cases actually argue somewhat > for right-associativity. ... Right-associativity adds some diversity > into the ecosystem and opens up some design space. You say that like it's a good thing. My argument is that anything which adds another line to Python's precedence table is a bad idea. Unless there's a really good reason for it. The two examples were the best I could come up with, and I don't think they are that persuasive. Looking at the table, the only places for it are on the *, /, //, % line or on the ** line. Since ** is right-associative, then the diversity argument combined with the "no new line" argument means @ should be on the same line, and with the same precedence, as ** In DSL space, that means @ could be used as the inverse of ** by those who want to discard any ties to its use in numerics. Considering it now, I agree this would indeed open up some design space. I don't see anything disastrously wrong for that in matrix/vector use, though my intuition on this is very limited. I believe this gives results like the "strong right" option, no? As an observation, if this is done, and if you want operators to look symmetrical at the syntax level, and if a matrix exponentiation operator isn't critical, then perhaps use '@@' for matrix multiplication, and leave '@' for decorators? It would be a small reminder that '@@' has higher precedence than '*', and may help reduce the momentary confusion upon seeing something like @a at b @c def f(): pass Of course, "f(*x, **y)" are perfectly good unary-looking uses of '*' and '**' and don't cause confusion with the binary forms, so this is all that strong of an objection. Cheers, Andrew dalke at dalkescientific.com From d.s.seljebotn at astro.uio.no Thu Mar 20 09:26:53 2014 From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn) Date: Thu, 20 Mar 2014 14:26:53 +0100 Subject: [Numpy-discussion] [help needed] associativity and precedence of '@' In-Reply-To: References: Message-ID: <532AEC9D.5010808@astro.uio.no> On 03/19/2014 08:45 PM, josef.pktd at gmail.com wrote: > > > > On Wed, Mar 19, 2014 at 2:24 PM, Nathaniel Smith > wrote: > > On Tue, Mar 18, 2014 at 9:14 AM, Robert Kern > wrote: > > On Tue, Mar 18, 2014 at 12:54 AM, Nathaniel Smith > wrote: > >> On Sat, Mar 15, 2014 at 6:28 PM, Nathaniel Smith > wrote: > >>> Mathematica: instead of having an associativity, a @ b @ c gets > >>> converted into mdot([a, b, c]) > >> > >> So, I've been thinking about this (thanks to @rfateman for > pointing it > >> out), and wondering if Mathematica's approach is worth following up > >> more. (It would need to make it past python-dev, of course, but > worst > >> case is just that they say no and we're back where we are now, so we > >> might as well think it through.) > > > > I predict with near-certainty that this will be rejected, > > I guess that's what everyone thought about @ too? ;-) > > > but that > > doesn't prevent it from derailing the discussion. This proposal is > > unlike anything else in Python. Chained comparisons are *not* similar > > to this proposal. The chaining only happens at the syntax level, not > > the semantics. `a < b < c` gets compiled down to `a.__lt__(b) and > > b.__lt__(c)`, not `do_comparison([a, b, c], [lt, lt])`. > > Yes, the syntax is the same as chained comparisons, and the dispatch > is a generalization of regular operators. It is unusual; OTOH, @ is > unusual in that no other operators in Python have the property that > evaluating in the wrong order can cost you seconds of time and > gigabytes of memory. Perhaps. > > > We have approval for a binary @ operator. Take the win. > > We have approval, and we have a request: that we figure out how @ > should work in detail to be most useful to us. Maybe that's this > proposal; maybe not. Ultimately rejected-or-not-rejected comes down to > how strong the arguments for something are. And while we can make some > guesses about that, it's impossible to know how strong an argument > will be until one sits down and works it out. So I still would like to > hear what people think, even if it just ends in the conclusion that > it's a terrible idea ;-). > > > > What happens if you have 5 @ in a row? > > My head hurts if I had to think about what would actually be going on. > and don't forget, the sparse matrix is stuck in the middle. Order-of-matrix-multiplication is literally my textbook example of a dynamic programming problem with complexity O(n^2) where n is number of terms (as in, it's how dynamic programming is introduced in my textbook). I don't think adding sparse or diagonal matrices changes this as long as you only deal with chained @ and make some simple assumptions of the cost of a FLOP in sparse @ dense, sparse @ sparse, dense @ dense, and so on. Where you need anything more than very simple dynamic programming algorithms is when you add + into the mix ("whether to use the distributive rule or not" and so on). I'm positive to the chained @ idea, I think it's the answer to "what we really want". Dag Sverre From robert.kern at gmail.com Thu Mar 20 09:31:42 2014 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 20 Mar 2014 13:31:42 +0000 Subject: [Numpy-discussion] [help needed] associativity and precedence of '@' In-Reply-To: <772C85C4-83A1-441A-BCC6-A3DCAD96D9C4@dalkescientific.com> References: <14291F69-C4B0-41CC-AEA5-9ABC55E5198B@dalkescientific.com> <772C85C4-83A1-441A-BCC6-A3DCAD96D9C4@dalkescientific.com> Message-ID: On Thu, Mar 20, 2014 at 1:10 PM, Andrew Dalke wrote: > On Mar 20, 2014, at 10:07 AM, Robert Kern wrote: >> I think the operator-overload-as-DSL use cases actually argue somewhat >> for right-associativity. ... Right-associativity adds some diversity >> into the ecosystem and opens up some design space. > > You say that like it's a good thing. Hey, I just want a multiplication operator. You're the one who wants a new DSL toy. ;-) > My argument is that anything which adds another line to Python's > precedence table is a bad idea. Unless there's a really good reason > for it. The two examples were the best I could come up with, and I don't > think they are that persuasive. Really? I mean, |, ^, and & *each* get their own precedence level. I'm not really sure that there is an aversion to adding precedence levels. In fact, it almost seems like the opposite: everything gets its own level unless if they obviously go together. > Looking at the table, the only places for it are on the *, /, //, % > line or on the ** line. Since ** is right-associative, then the > diversity argument combined with the "no new line" argument means > @ should be on the same line, and with the same precedence, as ** > > In DSL space, that means @ could be used as the inverse of ** by those > who want to discard any ties to its use in numerics. Considering it > now, I agree this would indeed open up some design space. > > I don't see anything disastrously wrong for that in matrix/vector use, > though my intuition on this is very limited. I believe this gives > results like the "strong right" option, no? > > > As an observation, if this is done, and if you want operators to look > symmetrical at the syntax level, and if a matrix exponentiation > operator isn't critical, then perhaps use '@@' for matrix multiplication, > and leave '@' for decorators? It would be a small reminder that '@@' > has higher precedence than '*', and may help reduce the momentary > confusion upon seeing something like > > @a at b > @c > def f(): > pass The decorator syntax is deliberately limited to prevent this: you can't put an arbitrary expression after the initiating @, just a (potentially dotted) name that may or may not be called maybe with some arguments. One couldn't do this, for example: [~/scratch] |1> s = """ ..> @a+b ..> def f(): ..> pass ..> """ [~/scratch] |4> compile(s, '', 'exec') File "", line 2 @a+b ^ SyntaxError: invalid syntax -- Robert Kern From d.s.seljebotn at astro.uio.no Thu Mar 20 09:36:41 2014 From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn) Date: Thu, 20 Mar 2014 14:36:41 +0100 Subject: [Numpy-discussion] [help needed] associativity and precedence of '@' In-Reply-To: <532AEC9D.5010808@astro.uio.no> References: <532AEC9D.5010808@astro.uio.no> Message-ID: <532AEEE9.1060903@astro.uio.no> On 03/20/2014 02:26 PM, Dag Sverre Seljebotn wrote: > On 03/19/2014 08:45 PM, josef.pktd at gmail.com wrote: >> >> >> >> On Wed, Mar 19, 2014 at 2:24 PM, Nathaniel Smith > > wrote: >> >> On Tue, Mar 18, 2014 at 9:14 AM, Robert Kern > > wrote: >> > On Tue, Mar 18, 2014 at 12:54 AM, Nathaniel Smith > > wrote: >> >> On Sat, Mar 15, 2014 at 6:28 PM, Nathaniel Smith > > wrote: >> >>> Mathematica: instead of having an associativity, a @ b @ c gets >> >>> converted into mdot([a, b, c]) >> >> >> >> So, I've been thinking about this (thanks to @rfateman for >> pointing it >> >> out), and wondering if Mathematica's approach is worth following up >> >> more. (It would need to make it past python-dev, of course, but >> worst >> >> case is just that they say no and we're back where we are now, so we >> >> might as well think it through.) >> > >> > I predict with near-certainty that this will be rejected, >> >> I guess that's what everyone thought about @ too? ;-) >> >> > but that >> > doesn't prevent it from derailing the discussion. This proposal is >> > unlike anything else in Python. Chained comparisons are *not* similar >> > to this proposal. The chaining only happens at the syntax level, not >> > the semantics. `a < b < c` gets compiled down to `a.__lt__(b) and >> > b.__lt__(c)`, not `do_comparison([a, b, c], [lt, lt])`. >> >> Yes, the syntax is the same as chained comparisons, and the dispatch >> is a generalization of regular operators. It is unusual; OTOH, @ is >> unusual in that no other operators in Python have the property that >> evaluating in the wrong order can cost you seconds of time and >> gigabytes of memory. Perhaps. >> >> > We have approval for a binary @ operator. Take the win. >> >> We have approval, and we have a request: that we figure out how @ >> should work in detail to be most useful to us. Maybe that's this >> proposal; maybe not. Ultimately rejected-or-not-rejected comes down to >> how strong the arguments for something are. And while we can make some >> guesses about that, it's impossible to know how strong an argument >> will be until one sits down and works it out. So I still would like to >> hear what people think, even if it just ends in the conclusion that >> it's a terrible idea ;-). >> >> >> >> What happens if you have 5 @ in a row? >> >> My head hurts if I had to think about what would actually be going on. >> and don't forget, the sparse matrix is stuck in the middle. > > Order-of-matrix-multiplication is literally my textbook example of a > dynamic programming problem with complexity O(n^2) where n is number of > terms (as in, it's how dynamic programming is introduced in my textbook). > > I don't think adding sparse or diagonal matrices changes this as long as > you only deal with chained @ and make some simple assumptions of the > cost of a FLOP in sparse @ dense, sparse @ sparse, dense @ dense, and so on. > > Where you need anything more than very simple dynamic programming > algorithms is when you add + into the mix ("whether to use the > distributive rule or not" and so on). > > I'm positive to the chained @ idea, I think it's the answer to "what we > really want". Sorry, I totally misunderstood this. The question is of course how you dispatch technically (where the __matmul__ function lives and which one to use), not figuring out what you want done. I think you'd need to keep this very simple; for instance, just require the leftmost matrix to implement __matmul__ that takes a list, ditch __rmatmul__, and then solve the rest on the library level. In our case, everyone would delegate __matmul__ to something in NumPy that then supports hooks and solves this on the library level. That would work as I say above + hooks to plug in cost estimators and compute functions for various matrix products. (I've thought too much about these things as I wasted at least a month of my PhD on the now abandoned "oomatrix" project to find the optimal way of computing a linear algebra expressions.) Dag Sverre From smudkavi at uwaterloo.ca Thu Mar 20 09:39:15 2014 From: smudkavi at uwaterloo.ca (Sankarshan Mudkavi) Date: Thu, 20 Mar 2014 09:39:15 -0400 Subject: [Numpy-discussion] Dates and times and Datetime64 (again) In-Reply-To: References: <79374FB2-205D-4B76-ADB2-F9895D3A2DF4@uwaterloo.ca> Message-ID: <77CDF317-1ADD-4DE0-85BA-5CA8E3D38381@uwaterloo.ca> Hi Nathaniel, It differs by allowing time zone info to be preserved if supplied. A naive datetime64 would be unable to handle this, and would either have to ignore the tzinfo or would have to throw up an exception. The current suggestion is very similar to a naive datetime64 and only differs in being able to handle the given tzinfo, rather than ignoring it or telling the user that the current implementation cannot handle it. This would be superioir to a naive dateime64 for use cases that have the tzinfo available, and would avoid the users having to workaround NumPy's inability to handle them if provided. A big thanks to Chris Barker for the write up linked in the proposal, it makes it very clear what the various possibilities are for improvement. Cheers, Sankarshan On Mar 20, 2014, at 7:16 AM, Nathaniel Smith wrote: > On 20 Mar 2014 02:07, "Sankarshan Mudkavi" wrote: > > I've written a rather rudimentary NEP, (lacking in technical details which I will hopefully add after some further discussion and receiving clarification/help on this thread). > > > > Please let me know how to proceed and what you think should be added to the current proposal (attached to this mail). > > > > Here is a rendered version of the same: > > https://github.com/Sankarshan-Mudkavi/numpy/blob/Enhance-datetime64/doc/neps/datetime-improvement-proposal.rst > > Your NEP suggests making all datetime64s be in UTC, and treating string representations from unknown timezones as UTC. How does this differ from, and why is it superior to, making all datetime64s be naive? > > -n > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -- Sankarshan Mudkavi Undergraduate in Physics, University of Waterloo www.smudkavi.com -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 496 bytes Desc: Message signed with OpenPGP using GPGMail URL: From robert.kern at gmail.com Thu Mar 20 09:41:07 2014 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 20 Mar 2014 13:41:07 +0000 Subject: [Numpy-discussion] [help needed] associativity and precedence of '@' In-Reply-To: <532AEEE9.1060903@astro.uio.no> References: <532AEC9D.5010808@astro.uio.no> <532AEEE9.1060903@astro.uio.no> Message-ID: On Thu, Mar 20, 2014 at 1:36 PM, Dag Sverre Seljebotn wrote: > On 03/20/2014 02:26 PM, Dag Sverre Seljebotn wrote: >> I'm positive to the chained @ idea, I think it's the answer to "what we >> really want". > > Sorry, I totally misunderstood this. The question is of course how you > dispatch technically (where the __matmul__ function lives and which one > to use), not figuring out what you want done. > > I think you'd need to keep this very simple; for instance, just require > the leftmost matrix to implement __matmul__ that takes a list, ditch > __rmatmul__, and then solve the rest on the library level. > > In our case, everyone would delegate __matmul__ to something in NumPy > that then supports hooks and solves this on the library level. That > would work as I say above + hooks to plug in cost estimators and compute > functions for various matrix products. To me, that signals that it's time to drop the operator for a library of functions, just like I prefer solve() and company to a matrix division operator. -- Robert Kern From njs at pobox.com Thu Mar 20 10:02:53 2014 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 20 Mar 2014 14:02:53 +0000 Subject: [Numpy-discussion] [help needed] associativity and precedence of '@' In-Reply-To: References: <14291F69-C4B0-41CC-AEA5-9ABC55E5198B@dalkescientific.com> Message-ID: On Thu, Mar 20, 2014 at 9:07 AM, Robert Kern wrote: > I think the operator-overload-as-DSL use cases actually argue somewhat > for right-associativity. There is no lack of left-associative > operators for these use cases to choose from since they usually don't > have numeric or bitwise operations defined for them. > Right-associativity adds some diversity into the ecosystem and opens > up some design space. Whether or not this is true, I think we should assign this argument ~zero weight for purposes of the present discussion. That's because: - We haven't been asked to figure out the best design of @ for Python overall, we've been asked to report back on what design of @ will be best for the numeric community, since that's where we have special expertise that python-dev lacks. Python-dev is entirely capable of then taking our report as input and then having a debate about how much weight to give to these other possible uses. - And anyway, my impression is that python-dev will give these other possible uses ~zero weight anyway -- if they thought random DSL operators were important for their own sake, they would have added @ long ago :-). Maybe if we say "we literally do not care at all what @'s precedence and associativity are", then it will matter as a tie-breaker, but first I don't think it's true that we don't care, and second even if it were then my guess is that the argument for consistency with the other operators would be a stronger tie-breaker. -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org From cjyxiaodi1 at gmail.com Thu Mar 20 11:04:36 2014 From: cjyxiaodi1 at gmail.com (Chia Jing Yi) Date: Thu, 20 Mar 2014 23:04:36 +0800 Subject: [Numpy-discussion] Numpy Installation Problem Asking Message-ID: Hi, I plan to plot a sashimi plot to view the alternative splicing event of my interested "gene" by using MISO. After follow the following link, http://genes.mit.edu/burgelab/miso/docs/and read through the forum. Unfortunately, I still face some problems to run the complete set of MISO successful. I can run some of the python script but I still fail to run some of it :( It shown the following error message when I try some of the MISO python script (compare_miso.py, run_miso.py, sashimi_plot.py, etc). *.* *.* *.* *ImportError: No module named _ufuncs* I can successful run some of the important MISO script (index_gff.py, sam_to_bam.py). I'm downloading everything by using Python-2.7. I suspect it might due to numpy or scipy installation problem. Unfortunately, I still fail to figure it out :( I have try to use Python-3.3. But it will show the following error message when I try to run all the python script under MISO : *.* *.* *SyntaxError: invalid syntax* I have browse through forum and seqanswer etc. Unfortunately I still fail to figure out why it happen :( I have to manual download all the package and install it separately as I don't have the purposely through access network through server. It is due to the network security of server at University. Thanks and looking forward to hear from any of you. best regards edge -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Thu Mar 20 13:25:06 2014 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 20 Mar 2014 17:25:06 +0000 Subject: [Numpy-discussion] [help needed] associativity and precedence of '@' In-Reply-To: References: Message-ID: On Wed, Mar 19, 2014 at 7:45 PM, Nathaniel Smith wrote: > Okay, I wrote a little script [1] to scan Python source files look for > things like 'dot(a, dot(b, c))' or 'dot(dot(a, b), c)', or the ndarray.dot > method equivalents. So what we get out is: > - a count of how many 'dot' calls there are > - a count of how often we see left-associative nestings: dot(dot(a, b), c) > - a count of how often we see right-associative nestings: dot(a, dot(b, c)) > > Running it on a bunch of projects, I get: > > | project | dots | left | right | right/left | > |--------------+------+------+-------+------------| > | scipy | 796 | 53 | 27 | 0.51 | > | nipy | 275 | 3 | 19 | 6.33 | > | scikit-learn | 472 | 11 | 10 | 0.91 | > | statsmodels | 803 | 46 | 38 | 0.83 | > | astropy | 17 | 0 | 0 | nan | > | scikit-image | 15 | 1 | 0 | 0.00 | > |--------------+------+------+-------+------------| > | total | 2378 | 114 | 94 | 0.82 | > Another way to visualize this, converting each contiguous "chain" of calls to np.dot into a parenthesized expression, and then counting how often we see each pattern. 1943 (_ @ _) 100 ((_ @ _) @ _) # left 86 (_ @ (_ @ _)) # right 2 (_ @ ((_ @ _) @ _)) 2 (((_ @ _) @ _) @ _) # left 1 ((_ @ (_ @ _)) @ _) 1 ((_ @ _) @ (_ @ _)) 1 (((_ @ _) @ _) @ (_ @ _)) 1 ((_ @ ((_ @ _) @ _)) @ _) 1 ((_ @ _) @ (_ @ (_ @ _))) (This is pooling scipy/nipy/scikit-learn/statsmodels.) I've noted the 3 different patterns that have a consistent associativity. >From this I'm leaning towards the conclusions that: - Expressions with complex parenthesization do happen, but probably not often enough to justify elaborate stuff like my 'chaining' proposal -- only 8.7% of these cases involve more than one @. - There's very little support here for the intuition that right-associativity is more useful than left-associativity on a day-to-day basis. -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Thu Mar 20 13:36:10 2014 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 20 Mar 2014 17:36:10 +0000 Subject: [Numpy-discussion] [help needed] associativity and precedence of '@' In-Reply-To: <532AEEE9.1060903@astro.uio.no> References: <532AEC9D.5010808@astro.uio.no> <532AEEE9.1060903@astro.uio.no> Message-ID: On Thu, Mar 20, 2014 at 1:36 PM, Dag Sverre Seljebotn wrote: > On 03/20/2014 02:26 PM, Dag Sverre Seljebotn wrote: >> Order-of-matrix-multiplication is literally my textbook example of a >> dynamic programming problem with complexity O(n^2) where n is number of >> terms (as in, it's how dynamic programming is introduced in my textbook). >> >> I don't think adding sparse or diagonal matrices changes this as long as >> you only deal with chained @ and make some simple assumptions of the >> cost of a FLOP in sparse @ dense, sparse @ sparse, dense @ dense, and so on. >> >> Where you need anything more than very simple dynamic programming >> algorithms is when you add + into the mix ("whether to use the >> distributive rule or not" and so on). >> >> I'm positive to the chained @ idea, I think it's the answer to "what we >> really want". > > Sorry, I totally misunderstood this. The question is of course how you > dispatch technically (where the __matmul__ function lives and which one > to use), not figuring out what you want done. Or even more specifically, the question is whether getting the chance to use dynamic programming on chains of @'s (and only @'s!) is so valuable that we want to have a special parsing+dispatch rule to allow it. I have to say that after glancing at a few hundred 'dot' calls, I'm not as convinced that this is useful in practice. There are lots of complex expressions out there involving 'dot', and relatively few of them involve long chains of 'dot' calls [1][2]. There are strategies for doing whole-expression optimization that work for more general expressions, not just @ -- e.g. numexpr, numba, theano -- at the cost of a bit more intrusiveness. And as numpy gets better at supporting non-ndarray types, then it'll be easier to seamlessly support low-impact deferred computation APIs like: a, b, c = defer(a, b, c) d = np.sin(a) + a @ b @ c e = d / (a + b + c + d) return force(e) Having a special dispatch for @ would only help with one of the computations here. -n [1] http://mail.scipy.org/pipermail/numpy-discussion/2014-March/069565.html [2] http://mail.scipy.org/pipermail/numpy-discussion/2014-March/069578.html -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org From josef.pktd at gmail.com Thu Mar 20 13:43:37 2014 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 20 Mar 2014 13:43:37 -0400 Subject: [Numpy-discussion] [help needed] associativity and precedence of '@' In-Reply-To: References: Message-ID: On Thu, Mar 20, 2014 at 1:25 PM, Nathaniel Smith wrote: > On Wed, Mar 19, 2014 at 7:45 PM, Nathaniel Smith wrote: > >> Okay, I wrote a little script [1] to scan Python source files look for >> things like 'dot(a, dot(b, c))' or 'dot(dot(a, b), c)', or the ndarray.dot >> method equivalents. So what we get out is: >> - a count of how many 'dot' calls there are >> - a count of how often we see left-associative nestings: dot(dot(a, b), c) >> - a count of how often we see right-associative nestings: dot(a, dot(b, >> c)) >> >> Running it on a bunch of projects, I get: >> >> | project | dots | left | right | right/left | >> |--------------+------+------+-------+------------| >> | scipy | 796 | 53 | 27 | 0.51 | >> | nipy | 275 | 3 | 19 | 6.33 | >> | scikit-learn | 472 | 11 | 10 | 0.91 | >> | statsmodels | 803 | 46 | 38 | 0.83 | >> | astropy | 17 | 0 | 0 | nan | >> | scikit-image | 15 | 1 | 0 | 0.00 | >> |--------------+------+------+-------+------------| >> | total | 2378 | 114 | 94 | 0.82 | >> > > Another way to visualize this, converting each contiguous "chain" of calls > to np.dot into a parenthesized expression, and then counting how often we > see each pattern. > > 1943 (_ @ _) > 100 ((_ @ _) @ _) # left > 86 (_ @ (_ @ _)) # right > 2 (_ @ ((_ @ _) @ _)) > 2 (((_ @ _) @ _) @ _) # left > 1 ((_ @ (_ @ _)) @ _) > 1 ((_ @ _) @ (_ @ _)) > 1 (((_ @ _) @ _) @ (_ @ _)) > 1 ((_ @ ((_ @ _) @ _)) @ _) > 1 ((_ @ _) @ (_ @ (_ @ _))) > > (This is pooling scipy/nipy/scikit-learn/statsmodels.) I've noted the 3 > different patterns that have a consistent associativity. > > From this I'm leaning towards the conclusions that: > > - Expressions with complex parenthesization do happen, but probably not > often enough to justify elaborate stuff like my 'chaining' proposal -- only > 8.7% of these cases involve more than one @. > just for statsmodels We do have a very large amount of chaining, but in many cases this has been taken out of a single expression into a temporary or permanent variable for parts of the chain. (similar to the quadratic form example in the PEP), either for clarity (a temp variable), or because one dot product shows up several times in the same expression (quadratic forms) or because we need to keep it around for reuse in other expressions. That's what I tried to explain before, that chaining and breaking up larger multi-dot expressions is most of the time a intentional choice and not just random because the the dot function forces us. The most convincing argument for me for @ is that it makes parenthesis visible (until I realized that I didn't really care about @). This reduces the cases where we separate out a dot product for clarity and readibility, but still leaves us with the other two cases, where our chaining won't change whatever numpy provides additionally. Josef > > - There's very little support here for the intuition that > right-associativity is more useful than left-associativity on a day-to-day > basis. > > -- > Nathaniel J. Smith > Postdoctoral researcher - Informatics - University of Edinburgh > http://vorpus.org > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Thu Mar 20 15:37:58 2014 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Thu, 20 Mar 2014 20:37:58 +0100 Subject: [Numpy-discussion] Numpy Installation Problem Asking In-Reply-To: References: Message-ID: On Thu, Mar 20, 2014 at 4:04 PM, Chia Jing Yi wrote: > Hi, > > I plan to plot a sashimi plot to view the alternative splicing event of my > interested "gene" by using MISO. > > After follow the following link, http://genes.mit.edu/burgelab/miso/docs/and read through the forum. > Unfortunately, I still face some problems to run the complete set of MISO > successful. > > I can run some of the python script but I still fail to run some of it :( > It shown the following error message when I try some of the MISO python > script (compare_miso.py, run_miso.py, sashimi_plot.py, etc). > *.* > *.* > *.* > *ImportError: No module named _ufuncs* > > I can successful run some of the important MISO script (index_gff.py, > sam_to_bam.py). > > I'm downloading everything by using Python-2.7. > I suspect it might due to numpy or scipy installation problem. > Unfortunately, I still fail to figure it out :( > I have try to use Python-3.3. > But it will show the following error message when I try to run all the > python script under MISO : > *.* > *.* > *SyntaxError: invalid syntax* > > I have browse through forum and seqanswer etc. > Unfortunately I still fail to figure out why it happen :( > I have to manual download all the package and install it separately as I > don't have the purposely through access network through server. > It is due to the network security of server at University. > > Thanks and looking forward to hear from any of you. > Hi, you have to give a full traceback otherwise we can't help you. And please don't double post, you sent the same message to scipy-user. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From stsci.perry at gmail.com Thu Mar 20 16:33:37 2014 From: stsci.perry at gmail.com (Perry Greenfield) Date: Thu, 20 Mar 2014 16:33:37 -0400 Subject: [Numpy-discussion] OT: job opening at STScI Message-ID: <2E4C2D69-38FF-4B83-992B-AB40F10A604E@gmail.com> The Science Software Branch at the Space Telescope Science Institute is seeking a developer that enjoys and is good at digging into and understanding the details of Python internals and its libraries--particularly those related to scientific computing--in order to support of our development of software tools for Python and the astronomical community, particularly for the Hubble Space Telescope and the next telescope under construction (JWST). Details on the position and how to apply can be found at this link: https://rn11.ultipro.com/SPA1004/JobBoard/JobDetails.aspx?__ID=*5ECC2DFF67015263 (applications are still being taken despite the date given?) Perry Greenfield From dalke at dalkescientific.com Thu Mar 20 16:38:01 2014 From: dalke at dalkescientific.com (Andrew Dalke) Date: Thu, 20 Mar 2014 21:38:01 +0100 Subject: [Numpy-discussion] [help needed] associativity and precedence of '@' In-Reply-To: References: <14291F69-C4B0-41CC-AEA5-9ABC55E5198B@dalkescientific.com> Message-ID: <5017DB2D-9CB1-4DDC-9F08-F6F768FC74B5@dalkescientific.com> On Mar 20, 2014, at 3:02 PM, Nathaniel Smith wrote: > - And anyway, my impression is that python-dev will give these other > possible uses ~zero weight anyway -- if they thought random DSL > operators were important for their own sake, they would have added @ > long ago :-). Unlike what you all seem to think, I *don't* want '@' as a DSL. As I said, I'm a same-left person. There's very little additional power in same-left for a DSL, over the other 4 available operators. I think that weakness is a good thing. I'm saying that it will be (ab)used in a DSL. Given no strong bias one way or another, I prefer to minimize any temptations or weird effects that a new precedence level might cause. Hence why I prefer that it act either the same as "*" or as "**", and to keep it from being more widely used, I more strongly prefer it the same as "*". You say "we've been asked to report back on what design of @ will be best for the numeric community, since that's where we have special expertise that python-dev lacks." I don't really think that goal means you can avoid considering all non-numeric consequences. Nor do I think I'm making any weighty arguments here. My observation though is that the signal so far is low enough that second-order effects are likely to have some influence in the final decision, either here or with python-dev. On Mar 20, 2014, at 2:31 PM, Robert Kern wrote: > Really? I mean, |, ^, and & *each* get their own precedence level. I'm > not really sure that there is an aversion to adding precedence levels. > In fact, it almost seems like the opposite: everything gets its own > level unless if they obviously go together. No new expression precedence level has been added since ** in January 1996, at least, not that I can identify. Those boolean ones you mentioned were added in October 1991 and follows the C operator precedence. Given the number of other languages do the same, and Python's goals, this makes sense. The ** was a secondary influence by Fortran. The closest Python came since 1996 is support for the if/else expression, but that has the same precedence as lambda. Eg, compare 3.4: test: or_test ['if' or_test 'else' test] | lambdef to 1.5.2: test: and_test ('or' and_test)* | lambdef (The documentation says they are at different levels, but they actually are at the same. Consider: lambda: 1 if print('hi') else print('bye') This is not the same as (lambda: 1) if print('hi') else print('bye') .) If you came from a Pascal background then you would expect '+' '-' 'or' and 'xor' to have the same precedence, since the adding operators obviously go together. While, oddly enough, C's == is not at the same level as <=, which suggests that the Dennis Ritchie at one time didn't think it was obvious that those two go together. (Then again, he also started off with a =- 1 instead of a -= 1.) The arguments against a complex precedence table are well-trod ground. See: http://stackoverflow.com/questions/6320424/ http://c2.com/cgi/wiki?OperatorPrecedenceConsideredHarmful Larry Wall's summary about operator precedence in http://perl6.org/archive/doc/design/apo/A03.html is apropos to the general topic of deciding upon operator precedence. Cheers, Andrew dalke at dalkescientific.com From robert.kern at gmail.com Thu Mar 20 17:39:37 2014 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 20 Mar 2014 21:39:37 +0000 Subject: [Numpy-discussion] [help needed] associativity and precedence of '@' In-Reply-To: <5017DB2D-9CB1-4DDC-9F08-F6F768FC74B5@dalkescientific.com> References: <14291F69-C4B0-41CC-AEA5-9ABC55E5198B@dalkescientific.com> <5017DB2D-9CB1-4DDC-9F08-F6F768FC74B5@dalkescientific.com> Message-ID: On Thu, Mar 20, 2014 at 8:38 PM, Andrew Dalke wrote: > You say "we've been asked to report back on what design of @ will > be best for the numeric community, since that's where we have special > expertise that python-dev lacks." I don't really think that goal > means you can avoid considering all non-numeric consequences. Sure, but that discussion will (and should) happen on python-ideas. When Nathaniel says that "we have been asked" to answer this very specific question, he means that literally. Guido has asked us to answer this specific question, not for our community's collective judgement call on the total question. Our answer will be considered along with other concerns (like the operator abuse-case) on python-ideas to end up at the final decision weighing all of the factors. We're trying to avoid the Abilene Paradox. > On Mar 20, 2014, at 2:31 PM, Robert Kern wrote: >> Really? I mean, |, ^, and & *each* get their own precedence level. I'm >> not really sure that there is an aversion to adding precedence levels. >> In fact, it almost seems like the opposite: everything gets its own >> level unless if they obviously go together. > > No new expression precedence level has been added since ** in > January 1996, at least, not that I can identify. Those boolean ones > you mentioned were added in October 1991 and follows the C operator > precedence. Given the number of other languages do the same, and > Python's goals, this makes sense. The ** was a secondary influence > by Fortran. > > The closest Python came since 1996 is support for the if/else > expression, but that has the same precedence as lambda. Eg, > compare 3.4: > > test: or_test ['if' or_test 'else' test] | lambdef > > to 1.5.2: > > test: and_test ('or' and_test)* | lambdef > > (The documentation says they are at different levels, but > they actually are at the same. Consider: > lambda: 1 if print('hi') else print('bye') > This is not the same as > (lambda: 1) if print('hi') else print('bye') > .) The observed parse is consistent with `if-else` having a higher precedence than lambda, as documented. If they *were* at the same level, the parse would be the paranthesized expression that you cite (just as 5%2*3 == (5%2)*3 != 5%(2*3)). > If you came from a Pascal background then you would expect > '+' '-' 'or' and 'xor' to have the same precedence, since > the adding operators obviously go together. > > While, oddly enough, C's == is not at the same level as <=, > which suggests that the Dennis Ritchie at one time didn't > think it was obvious that those two go together. (Then again, > he also started off with a =- 1 instead of a -= 1.) > > The arguments against a complex precedence table are > well-trod ground. See: > http://stackoverflow.com/questions/6320424/ > http://c2.com/cgi/wiki?OperatorPrecedenceConsideredHarmful > > Larry Wall's summary about operator precedence in > http://perl6.org/archive/doc/design/apo/A03.html is apropos > to the general topic of deciding upon operator precedence. I see a lot of assertions of fact there, but no empirical studies that actually make them facts. On that note, I wonder if we can convince Stefik and Siebert et al. to run a study for us to help answer these questions. If I were in their group, I'd jump at the chance to use their empirical methodology to help design an actual production language. http://neverworkintheory.org/2014/01/29/stefik-siebert-syntax.html -- Robert Kern From ndarray at mac.com Thu Mar 20 18:17:46 2014 From: ndarray at mac.com (Alexander Belopolsky) Date: Thu, 20 Mar 2014 18:17:46 -0400 Subject: [Numpy-discussion] [help needed] associativity and precedence of '@' In-Reply-To: <772C85C4-83A1-441A-BCC6-A3DCAD96D9C4@dalkescientific.com> References: <14291F69-C4B0-41CC-AEA5-9ABC55E5198B@dalkescientific.com> <772C85C4-83A1-441A-BCC6-A3DCAD96D9C4@dalkescientific.com> Message-ID: On Thu, Mar 20, 2014 at 9:10 AM, Andrew Dalke wrote: > In DSL space, that means @ could be used as the inverse of ** by those > who want to discard any ties to its use in numerics. Considering it > now, I agree this would indeed open up some design space. > > I don't see anything disastrously wrong for that in matrix/vector use, > though my intuition on this is very limited. I believe this gives > results like the "strong right" option, no? > It is not uncommon to have v**2 @ u in numerical code for a weighted sum of u with weights from v-squared. Under @ in the same line as **, this will be interpreted as v ** (2 @ u) and most likely be an error. -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Thu Mar 20 19:27:28 2014 From: chris.barker at noaa.gov (Chris Barker) Date: Thu, 20 Mar 2014 16:27:28 -0700 Subject: [Numpy-discussion] Dates and times and Datetime64 (again) In-Reply-To: References: <79374FB2-205D-4B76-ADB2-F9895D3A2DF4@uwaterloo.ca> Message-ID: On Thu, Mar 20, 2014 at 4:16 AM, Nathaniel Smith wrote: > Your NEP suggests making all datetime64s be in UTC, and treating string > representations from unknown timezones as UTC. How does this differ from, > and why is it superior to, making all datetime64s be naive? > > This came up in the conversation before -- I think the fact is that a 'naive' datetime and a UTC datetime are almost exactly the same. In essence you can use a UTC datetime and pretend it's naive in almost all cases. The difference comes down to I/O. If it's UTC, then an ISO 8601 string created from it would include a "Z" on the end (or a +0.00, I think), whereas naive datetime should have no TZ indicator. On input, the question is what do you do with an ISO string with a TZ indicator: 1) translate to UTC -- make sense is we have the "always UTC" definition 2) raise an exception -- makes sense if we have the naive definition 3) ignore it -- which would make some sense if were naive, but perhaps a little too prone to error. But the real issue with the current implementation is how an iso string with no TZ indicator is handled -- it currently assumes that means "use the localle TZ", which is more than not wrong, and clearly subject to errors. Also, it time-shifts to locale TZ when creating an ISO string, with no way to specify that. So: * I'm not sure what the new NEP is suggesting at all, actually, we need a fully description, with exampel sof what varios input / ouput would give. * I think there are more or less three options: 1) a) don't have any timezone handling at all -- all datetime64s are UTC. Always b) don't have any timezone handling at all -- all datetime64s are naive (the only difference between these two is I/O of strings, and maybe I/O of datetime objects with a time zone) 2) Have a time zone associated with the array -- defaulting to either UTC or None, but don't provide any implementation other than the tagging, with the ability to add in TZ handler if you want (can this be done efficiently?) 3) Full on proper TZ handling. I think (3) is off the table for now. I think (2) is what the NEP proposes, but I'd need more details, examples to know. I prefer 1(b), but 1(a) is close enough that I'd be happy with that, too. Writing this made me think of a third option -- tracking, but no real manipulation, of TZ. This would be analogous to the ISO 8601 does -- all it does is note an offset. A given DateTime64 array would have a given offset assigned to it, and the appropriate addition and subtraction would happen at I/O. Offset of 0.00 would be UTC, and there would be a None option for naive. I haven't thought that out for the inevitable complications, though. -CHB > -n > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From smudkavi at uwaterloo.ca Thu Mar 20 19:55:28 2014 From: smudkavi at uwaterloo.ca (Sankarshan Mudkavi) Date: Thu, 20 Mar 2014 19:55:28 -0400 Subject: [Numpy-discussion] Dates and times and Datetime64 (again) In-Reply-To: References: <79374FB2-205D-4B76-ADB2-F9895D3A2DF4@uwaterloo.ca> Message-ID: <8667C894-3EC0-4344-A3F2-0C150D4FBD7C@uwaterloo.ca> Hi Chris, > I think there are more or less three options: > 1) a) don't have any timezone handling at all -- all datetime64s are UTC. Always > b) don't have any timezone handling at all -- all datetime64s are naive > (the only difference between these two is I/O of strings, and maybe I/O of datetime objects with a time zone) > 2) Have a time zone associated with the array -- defaulting to either UTC or None, but don't provide any implementation other than the tagging, with the ability to add in TZ handler if you want (can this be done efficiently?) > 3) Full on proper TZ handling. > > I think (3) is off the table for now. > > I think (2) is what the NEP proposes, but I'd need more details, examples to know. > > I prefer 1(b), but 1(a) is close enough that I'd be happy with that, too. Yes 2) is indeed what I was suggesting. My apologies for being unclear, I was unsure of how much detail and technical information I should include in the proposal. I will update it and add more examples etc. to actually specify what I mean. I'm not sure how much of a hit the performance would take if we were to take of the Z handler. Do you have any major concerns as of now regarding that, or do you want to wait till I provide more specific details? It also looks like the last option you mentioned seems quite reasonable too. To only do what ISO 8601 does. Perhaps, it would be better to implement that first and then look for an improvement later on? Do you have a preference for this or the option 2) ? I will expand the NEP and hopefully make it clearer what it entails. Once again, thanks for the earlier write up. Cheers, Sankarshan On Mar 20, 2014, at 7:27 PM, Chris Barker wrote: > On Thu, Mar 20, 2014 at 4:16 AM, Nathaniel Smith wrote: > Your NEP suggests making all datetime64s be in UTC, and treating string representations from unknown timezones as UTC. How does this differ from, and why is it superior to, making all datetime64s be naive? > > This came up in the conversation before -- I think the fact is that a 'naive' datetime and a UTC datetime are almost exactly the same. In essence you can use a UTC datetime and pretend it's naive in almost all cases. > > The difference comes down to I/O. If it's UTC, then an ISO 8601 string created from it would include a "Z" on the end (or a +0.00, I think), whereas naive datetime should have no TZ indicator. > > On input, the question is what do you do with an ISO string with a TZ indicator: > 1) translate to UTC -- make sense is we have the "always UTC" definition > 2) raise an exception -- makes sense if we have the naive definition > 3) ignore it -- which would make some sense if were naive, but perhaps a little too prone to error. > > > But the real issue with the current implementation is how an iso string with no TZ indicator is handled -- it currently assumes that means "use the localle TZ", which is more than not wrong, and clearly subject to errors. > > Also, it time-shifts to locale TZ when creating an ISO string, with no way to specify that. > > So: > > * I'm not sure what the new NEP is suggesting at all, actually, we need a fully description, with exampel sof what varios input / ouput would give. > > * I think there are more or less three options: > 1) a) don't have any timezone handling at all -- all datetime64s are UTC. Always > b) don't have any timezone handling at all -- all datetime64s are naive > (the only difference between these two is I/O of strings, and maybe I/O of datetime objects with a time zone) > 2) Have a time zone associated with the array -- defaulting to either UTC or None, but don't provide any implementation other than the tagging, with the ability to add in TZ handler if you want (can this be done efficiently?) > 3) Full on proper TZ handling. > > I think (3) is off the table for now. > > I think (2) is what the NEP proposes, but I'd need more details, examples to know. > > I prefer 1(b), but 1(a) is close enough that I'd be happy with that, too. > > Writing this made me think of a third option -- tracking, but no real manipulation, of TZ. This would be analogous to the ISO 8601 does -- all it does is note an offset. A given DateTime64 array would have a given offset assigned to it, and the appropriate addition and subtraction would happen at I/O. Offset of 0.00 would be UTC, and there would be a None option for naive. > > I haven't thought that out for the inevitable complications, though. > > -CHB > > > > > > > > > > > > > > > > > > > > > -n > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > -- > > Christopher Barker, Ph.D. > Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R (206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > Chris.Barker at noaa.gov > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -- Sankarshan Mudkavi Undergraduate in Physics, University of Waterloo www.smudkavi.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From dalke at dalkescientific.com Thu Mar 20 20:18:25 2014 From: dalke at dalkescientific.com (Andrew Dalke) Date: Fri, 21 Mar 2014 01:18:25 +0100 Subject: [Numpy-discussion] [help needed] associativity and precedence of '@' In-Reply-To: References: <14291F69-C4B0-41CC-AEA5-9ABC55E5198B@dalkescientific.com> <5017DB2D-9CB1-4DDC-9F08-F6F768FC74B5@dalkescientific.com> Message-ID: On Mar 20, 2014, at 10:39 PM, Robert Kern wrote: > Sure, but that discussion will (and should) happen on python-ideas. > When Nathaniel says that "we have been asked" to answer this very > specific question, he means that literally. Ah, now I understand. Thanks! Andrew dalke at dalkescientific.com From ndarray at mac.com Thu Mar 20 20:53:10 2014 From: ndarray at mac.com (Alexander Belopolsky) Date: Thu, 20 Mar 2014 20:53:10 -0400 Subject: [Numpy-discussion] Dates and times and Datetime64 (again) In-Reply-To: References: <79374FB2-205D-4B76-ADB2-F9895D3A2DF4@uwaterloo.ca> Message-ID: On Thu, Mar 20, 2014 at 7:16 AM, Nathaniel Smith wrote: > Your NEP suggests making all datetime64s be in UTC, and treating string > representations from unknown timezones as UTC. I recall that it was at some point suggested that epoch be part of dtype. I was not able to find the reasons for a rejection, but it would make perfect sense to keep timezone offset in dtype and treat it effectively as an alternative epoch. The way I like to think about datetime is that YYYY-MM-DD hh:mm:ss.nnn is just a fancy way to represent numbers which is more convoluted than decimal notation, but conceptually not so different. So different units, epochs or timezones are just different ways to convert an abstract notion of a point in time to a specific series of bits inside an array. This is what dtype is for - a description of how abstract numbers are stored in memory. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ndarray at mac.com Thu Mar 20 21:11:25 2014 From: ndarray at mac.com (Alexander Belopolsky) Date: Thu, 20 Mar 2014 21:11:25 -0400 Subject: [Numpy-discussion] Dates and times and Datetime64 (again) In-Reply-To: <77CDF317-1ADD-4DE0-85BA-5CA8E3D38381@uwaterloo.ca> References: <79374FB2-205D-4B76-ADB2-F9895D3A2DF4@uwaterloo.ca> <77CDF317-1ADD-4DE0-85BA-5CA8E3D38381@uwaterloo.ca> Message-ID: On Thu, Mar 20, 2014 at 9:39 AM, Sankarshan Mudkavi wrote: > A naive datetime64 would be unable to handle this, and would either have > to ignore the tzinfo or would have to throw up an exception. This is not true. Python's own datetime has no problem handling this: >>> t1 = datetime(2000,1,1,12) >>> t2 = datetime(2000,1,1,12,tzinfo=timezone.utc) >>> print(t1) 2000-01-01 12:00:00 >>> print(t2) 2000-01-01 12:00:00+00:00 -------------- next part -------------- An HTML attachment was scrubbed... URL: From cjyxiaodi1 at gmail.com Thu Mar 20 21:20:17 2014 From: cjyxiaodi1 at gmail.com (Chia Jing Yi) Date: Fri, 21 Mar 2014 09:20:17 +0800 Subject: [Numpy-discussion] Numpy Installation Problem Asking In-Reply-To: References: Message-ID: Hi, Thanks a lot for your email. I will upload the full traceback soon. Sorry for posting the same thread at scipy-user too. I don't aware about that numpy and scipy user group is linked. I apologize for my mistakes. I will be more careful in future. best regards Edge On Fri, Mar 21, 2014 at 3:37 AM, Ralf Gommers wrote: > > > > On Thu, Mar 20, 2014 at 4:04 PM, Chia Jing Yi wrote: > >> Hi, >> >> I plan to plot a sashimi plot to view the alternative splicing event of >> my interested "gene" by using MISO. >> >> After follow the following link, http://genes.mit.edu/burgelab/miso/docs/and read through the forum. >> Unfortunately, I still face some problems to run the complete set of MISO >> successful. >> >> I can run some of the python script but I still fail to run some of it :( >> It shown the following error message when I try some of the MISO python >> script (compare_miso.py, run_miso.py, sashimi_plot.py, etc). >> *.* >> *.* >> *.* >> *ImportError: No module named _ufuncs* >> >> I can successful run some of the important MISO script (index_gff.py, >> sam_to_bam.py). >> >> I'm downloading everything by using Python-2.7. >> I suspect it might due to numpy or scipy installation problem. >> Unfortunately, I still fail to figure it out :( >> I have try to use Python-3.3. >> But it will show the following error message when I try to run all the >> python script under MISO : >> *.* >> *.* >> *SyntaxError: invalid syntax* >> >> I have browse through forum and seqanswer etc. >> Unfortunately I still fail to figure out why it happen :( >> I have to manual download all the package and install it separately as I >> don't have the purposely through access network through server. >> It is due to the network security of server at University. >> >> Thanks and looking forward to hear from any of you. >> > > Hi, you have to give a full traceback otherwise we can't help you. And > please don't double post, you sent the same message to scipy-user. > > Ralf > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ndarray at mac.com Thu Mar 20 21:32:13 2014 From: ndarray at mac.com (Alexander Belopolsky) Date: Thu, 20 Mar 2014 21:32:13 -0400 Subject: [Numpy-discussion] Dates and times and Datetime64 (again) In-Reply-To: References: <79374FB2-205D-4B76-ADB2-F9895D3A2DF4@uwaterloo.ca> Message-ID: On Thu, Mar 20, 2014 at 7:27 PM, Chris Barker wrote: > On Thu, Mar 20, 2014 at 4:16 AM, Nathaniel Smith wrote: > >> Your NEP suggests making all datetime64s be in UTC, and treating string >> representations from unknown timezones as UTC. How does this differ from, >> and why is it superior to, making all datetime64s be naive? >> >> This came up in the conversation before -- I think the fact is that a > 'naive' datetime and a UTC datetime are almost exactly the same. In essence > you can use a UTC datetime and pretend it's naive in almost all cases. > > The difference comes down to I/O. > It is more than I/O. It is also about interoperability with Python's datetime module. Here is the behavior that I don't like in the current implementation: >>> d = array(['2001-01-01T12:00'], dtype='M8[ms]') >>> d.item(0) datetime.datetime(2001, 1, 1, 17, 0) If I understand NEP correctly, the proposal is to make d.item(0) return >>> d.item(0).replace(tzinfo=timezone.utc) datetime.datetime(2001, 1, 1, 12, 0, tzinfo=datetime.timezone.utc) instead. But this is not what I would expect: I want >>> d.item(0) datetime.datetime(2001, 1, 1, 12, 0) When I work with naive datetime objects I don't want to be exposed to timezones at all. -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Fri Mar 21 17:12:27 2014 From: chris.barker at noaa.gov (Chris Barker) Date: Fri, 21 Mar 2014 14:12:27 -0700 Subject: [Numpy-discussion] Dates and times and Datetime64 (again) In-Reply-To: <8667C894-3EC0-4344-A3F2-0C150D4FBD7C@uwaterloo.ca> References: <79374FB2-205D-4B76-ADB2-F9895D3A2DF4@uwaterloo.ca> <8667C894-3EC0-4344-A3F2-0C150D4FBD7C@uwaterloo.ca> Message-ID: On Thu, Mar 20, 2014 at 4:55 PM, Sankarshan Mudkavi wrote: > Yes 2) is indeed what I was suggesting. My apologies for being unclear, I > was unsure of how much detail and technical information I should include in > the proposal. > well, you need to put enough in that it's clear what it means. I think examples are critical -- at least that's how I learn things. > I'm not sure how much of a hit the performance would take if we were to > take of the Z handler. Do you have any major concerns as of now regarding > that, or do you want to wait till I provide more specific details? > more detail would be good. My comment about performance is that if numpy needs to call a Python object to do the time zone handling for each value in an array, that is going to pretty slow -- but maybe better than not having it at all. And there shouldn't be any reason not to have a fast path for when the array is naive or you are working with two arrays that are in the same TZ -- the really common case that we care about performance for. So ot probably comes down to one extra field... It also looks like the last option you mentioned seems quite reasonable > too. To only do what ISO 8601 does. Perhaps, it would be better to > implement that first and then look for an improvement later on? Do you have > a preference for this or the option 2) ? > I'm liking that one: It seems pretty easy to allow a tag for TZ offset, and not much extra math when converting. And this could be pretty useful. But I'm not writing the code... -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Fri Mar 21 17:18:52 2014 From: chris.barker at noaa.gov (Chris Barker) Date: Fri, 21 Mar 2014 14:18:52 -0700 Subject: [Numpy-discussion] Dates and times and Datetime64 (again) In-Reply-To: References: <79374FB2-205D-4B76-ADB2-F9895D3A2DF4@uwaterloo.ca> Message-ID: On Thu, Mar 20, 2014 at 5:53 PM, Alexander Belopolsky wrote: > I recall that it was at some point suggested that epoch be part of dtype. > I was not able to find the reasons for a rejection, > I don't think it was rejected, it just wasn't adopted by anyone to write a NEP and write the code... I actually think it's silly to allow changing the units without changing the epoch. But the pre-defined epoch works fine for all my use cass, so I'm not going to push that. I also did think it was a separate issue that timezones, and thus shouldn't clutter up the NEP (though one someone is opening the code, it would be a good time to do it..) but it would make perfect sense to keep timezone offset in dtype and treat > it effectively as an alternative epoch. > Hmm -- good point -- if we had a dynamic epoch you could just sift that to account for the time zone offset. Though I think that's an implementation issue. The way I like to think about datetime is that YYYY-MM-DD hh:mm:ss.nnn is > just a fancy way to represent numbers which is more convoluted than decimal > notation, but conceptually not so different. So different units, epochs or > timezones are just different ways to convert an abstract notion of a point > in time to a specific series of bits inside an array. This is what dtype > is for - a description of how abstract numbers are stored in memory. > yes -- and also how to convert to/from other types -- which is where the trick is here. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Fri Mar 21 17:31:45 2014 From: chris.barker at noaa.gov (Chris Barker) Date: Fri, 21 Mar 2014 14:31:45 -0700 Subject: [Numpy-discussion] Dates and times and Datetime64 (again) In-Reply-To: References: <79374FB2-205D-4B76-ADB2-F9895D3A2DF4@uwaterloo.ca> Message-ID: On Thu, Mar 20, 2014 at 6:32 PM, Alexander Belopolsky wrote: > The difference comes down to I/O. > > It is more than I/O. It is also about interoperability with Python's > datetime module. > Sorry -- I was using I/O to mean "converting to/from datetime64 and other types" So that included datetime.datetime. Here is the behavior that I don't like in the current implementation: > > >>> d = array(['2001-01-01T12:00'], dtype='M8[ms]') > >>> d.item(0) > datetime.datetime(2001, 1, 1, 17, 0) > yup , it converted to UTC using your locale setting -- really not good! Then tossed that our when creating a datetime.datetime. This really is quite broken. But this brings up a good point -- having time zone handling fully compatible ith datetime.datetime would have its advantages. So use the same tzinfo API. If I understand NEP correctly, the proposal is to make d.item(0) return > > >>> d.item(0).replace(tzinfo=timezone.utc) > datetime.datetime(2001, 1, 1, 12, 0, tzinfo=datetime.timezone.utc) > > instead. But this is not what I would expect: I want > > >>> d.item(0) > datetime.datetime(2001, 1, 1, 12, 0) > > When I work with naive datetime objects I don't want to be exposed to > timezones at all. > right -- naive time zones really would be good. The problem now with the current code and your example, is that in: >>> d = array(['2001-01-01T12:00'], dtype='M8[ms]') '2001-01-01T12:00' is interpreted as meaning "in the machines locale time zone" combining that with teh UTC assumption, and you have trouble. The work around for what you want now is to add TZ info to the string: In [56]: d = np.array(['2001-01-01T12:00Z'], dtype='M8[ms]') In [57]: d.item(0) Out[57]: datetime.datetime(2001, 1, 1, 12, 0) or: In [60]: d = np.array(['2001-01-01T12:00-00:00'], dtype='M8[ms]') In [61]: d.item(0) Out[61]: datetime.datetime(2001, 1, 1, 12, 0) I _think_ that's what you want. This is what I mean that naive and UTC are almost the same. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From ndarray at mac.com Fri Mar 21 18:22:53 2014 From: ndarray at mac.com (Alexander Belopolsky) Date: Fri, 21 Mar 2014 18:22:53 -0400 Subject: [Numpy-discussion] Dates and times and Datetime64 (again) In-Reply-To: References: <79374FB2-205D-4B76-ADB2-F9895D3A2DF4@uwaterloo.ca> Message-ID: On Fri, Mar 21, 2014 at 5:31 PM, Chris Barker wrote: > But this brings up a good point -- having time zone handling fully > compatible ith datetime.datetime would have its advantages. I don't know if everyone is aware of this, but Python stdlib has support for fixed-offset timezones since version 3.2: http://docs.python.org/3.2/whatsnew/3.2.html#datetime-and-time It took many years to bring in that feature, but now we can benefit from not having to reinvent the wheel. I will try to write up some specific proposal this weekend. -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Fri Mar 21 18:43:20 2014 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 21 Mar 2014 22:43:20 +0000 Subject: [Numpy-discussion] Dates and times and Datetime64 (again) In-Reply-To: References: <79374FB2-205D-4B76-ADB2-F9895D3A2DF4@uwaterloo.ca> Message-ID: On Thu, Mar 20, 2014 at 11:27 PM, Chris Barker wrote: > * I think there are more or less three options: > 1) a) don't have any timezone handling at all -- all datetime64s are UTC. Always > b) don't have any timezone handling at all -- all datetime64s are naive > (the only difference between these two is I/O of strings, and maybe I/O of datetime objects with a time zone) > 2) Have a time zone associated with the array -- defaulting to either UTC or None, but don't provide any implementation other than the tagging, with the ability to add in TZ handler if you want (can this be done efficiently?) > 3) Full on proper TZ handling. > > I think (3) is off the table for now. > > I think (2) is what the NEP proposes, but I'd need more details, examples to know. > > I prefer 1(b), but 1(a) is close enough that I'd be happy with that, too. I think the first goal is to define what a plain vanilla datetime64 does, without any extra attributes. This is for two practical reasons: First, our overriding #1 goal is to fix the nasty I/O problems that default datetime64's show, so until that's done any other bells and whistles are a distraction. And second, adding parameters to dtypes right now is technically messy. This rules out (2) and (3). If we additionally want to keep the option of adding a timezone parameter later, and have the result end up looking like stdlib datetime, then I think 1(b) is the obvious choice. My guess is that this is also what's most compatible with pandas, which is currently keeping its own timezone object outside of the dtype. Any downsides? I guess this would mean that we start raising an error on ISO 8601's with offsets attached, which might annoy some people? > Writing this made me think of a third option -- tracking, but no real manipulation, of TZ. This would be analogous to the ISO 8601 does -- all it does is note an offset. A given DateTime64 array would have a given offset assigned to it, and the appropriate addition and subtraction would happen at I/O. Offset of 0.00 would be UTC, and there would be a None option for naive. Please no! An integer offset is a terrible way to represent timezones, and hardcoding this would just get in the way of a proper solution. -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org From alan.isaac at gmail.com Fri Mar 21 20:26:10 2014 From: alan.isaac at gmail.com (Alan G Isaac) Date: Fri, 21 Mar 2014 20:26:10 -0400 Subject: [Numpy-discussion] unique return_index order? Message-ID: <532CD8A2.1080503@gmail.com> The documentation of numpy.unique http://docs.scipy.org/doc/numpy/reference/generated/numpy.unique.html does not seem to promise that return_index=True will always index the *first* occurrence of each unique item, which I believe is the current behavior. A promise would be nice. Is it intended? Alan Isaac From josef.pktd at gmail.com Fri Mar 21 20:41:33 2014 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 21 Mar 2014 20:41:33 -0400 Subject: [Numpy-discussion] unique return_index order? In-Reply-To: <532CD8A2.1080503@gmail.com> References: <532CD8A2.1080503@gmail.com> Message-ID: On Fri, Mar 21, 2014 at 8:26 PM, Alan G Isaac wrote: > The documentation of numpy.unique > http://docs.scipy.org/doc/numpy/reference/generated/numpy.unique.html > does not seem to promise that return_index=True will always index the > *first* occurrence of each unique item, which I believe is the current > behavior. > > A promise would be nice. > Is it intended? > AFAIU it's not, or it was in version, but shouldn't be. ?? I think this broke return_inverse in some cases if both were set to true I haven't kept track of the problems, and the code still seems to be the same that was changed to stable sorting. Josef > > Alan Isaac > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Fri Mar 21 20:46:02 2014 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 21 Mar 2014 18:46:02 -0600 Subject: [Numpy-discussion] unique return_index order? In-Reply-To: <532CD8A2.1080503@gmail.com> References: <532CD8A2.1080503@gmail.com> Message-ID: On Fri, Mar 21, 2014 at 6:26 PM, Alan G Isaac wrote: > The documentation of numpy.unique > http://docs.scipy.org/doc/numpy/reference/generated/numpy.unique.html > does not seem to promise that return_index=True will always index the > *first* occurrence of each unique item, which I believe is the current > behavior. > > A promise would be nice. > Is it intended? > > Yes, it is intended, although the required mergesort wasn't available for all types before numpy 1.7. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Fri Mar 21 20:49:27 2014 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 21 Mar 2014 20:49:27 -0400 Subject: [Numpy-discussion] unique return_index order? In-Reply-To: References: <532CD8A2.1080503@gmail.com> Message-ID: On Fri, Mar 21, 2014 at 8:46 PM, Charles R Harris wrote: > > > > On Fri, Mar 21, 2014 at 6:26 PM, Alan G Isaac wrote: > >> The documentation of numpy.unique >> http://docs.scipy.org/doc/numpy/reference/generated/numpy.unique.html >> does not seem to promise that return_index=True will always index the >> *first* occurrence of each unique item, which I believe is the current >> behavior. >> >> A promise would be nice. >> Is it intended? >> >> > Yes, it is intended, although the required mergesort wasn't available for > all types before numpy 1.7. > Does this mean return_inverse works again for all cases, even with return_index? I removed return_index from my code in statsmodels because I make frequent use of return_inverse, which was broken. We don't have any unittests in statsmodels anymore that use both return_xxx. Josef > > Chuck > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Fri Mar 21 21:01:45 2014 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 21 Mar 2014 19:01:45 -0600 Subject: [Numpy-discussion] unique return_index order? In-Reply-To: References: <532CD8A2.1080503@gmail.com> Message-ID: On Fri, Mar 21, 2014 at 6:49 PM, wrote: > > > > On Fri, Mar 21, 2014 at 8:46 PM, Charles R Harris < > charlesr.harris at gmail.com> wrote: > >> >> >> >> On Fri, Mar 21, 2014 at 6:26 PM, Alan G Isaac wrote: >> >>> The documentation of numpy.unique >>> http://docs.scipy.org/doc/numpy/reference/generated/numpy.unique.html >>> does not seem to promise that return_index=True will always index the >>> *first* occurrence of each unique item, which I believe is the current >>> behavior. >>> >>> A promise would be nice. >>> Is it intended? >>> >>> >> Yes, it is intended, although the required mergesort wasn't available for >> all types before numpy 1.7. >> > > Does this mean return_inverse works again for all cases, even with > return_index? > > I removed return_index from my code in statsmodels because I make frequent > use of return_inverse, which was broken. We don't have any unittests in > statsmodels anymore that use both return_xxx. > > I don't know, needs checking. Seems to work now with a simple trial array of integers. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Fri Mar 21 21:27:13 2014 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 21 Mar 2014 21:27:13 -0400 Subject: [Numpy-discussion] unique return_index order? In-Reply-To: References: <532CD8A2.1080503@gmail.com> Message-ID: On Fri, Mar 21, 2014 at 9:01 PM, Charles R Harris wrote: > > > > On Fri, Mar 21, 2014 at 6:49 PM, wrote: > >> >> >> >> On Fri, Mar 21, 2014 at 8:46 PM, Charles R Harris < >> charlesr.harris at gmail.com> wrote: >> >>> >>> >>> >>> On Fri, Mar 21, 2014 at 6:26 PM, Alan G Isaac wrote: >>> >>>> The documentation of numpy.unique >>>> http://docs.scipy.org/doc/numpy/reference/generated/numpy.unique.html >>>> does not seem to promise that return_index=True will always index the >>>> *first* occurrence of each unique item, which I believe is the current >>>> behavior. >>>> >>>> A promise would be nice. >>>> Is it intended? >>>> >>>> >>> Yes, it is intended, although the required mergesort wasn't available >>> for all types before numpy 1.7. >>> >> >> Does this mean return_inverse works again for all cases, even with >> return_index? >> >> I removed return_index from my code in statsmodels because I make >> frequent use of return_inverse, which was broken. We don't have any >> unittests in statsmodels anymore that use both return_xxx. >> >> > I don't know, needs checking. Seems to work now with a simple trial array > of integers. > my example from may 2012, thread "1.6.2 no more unique for rows" works fine on python 3.3 numpy 1.7.1 >>> groups = np.random.randint(0,4,size=(10,2)) >>> groups_ = groups.view([('',groups.dtype)]*groups.shape[1]).flatten() >>> uni, uni_idx, uni_inv = np.unique(groups_, return_index=True, return_inverse=True) >>> uni array([(0, 2), (0, 3), (1, 0), (2, 1), (2, 2), (3, 2), (3, 3)], dtype=[('f0', '>> uni_inv array([1, 6, 3, 4, 5, 3, 2, 5, 0, 2], dtype=int32) >>> np.__version__ '1.7.1' Thanks, Josef > Chuck > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Fri Mar 21 21:37:14 2014 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 21 Mar 2014 21:37:14 -0400 Subject: [Numpy-discussion] unique return_index order? In-Reply-To: References: <532CD8A2.1080503@gmail.com> Message-ID: On Fri, Mar 21, 2014 at 9:27 PM, wrote: > > > > On Fri, Mar 21, 2014 at 9:01 PM, Charles R Harris < > charlesr.harris at gmail.com> wrote: > >> >> >> >> On Fri, Mar 21, 2014 at 6:49 PM, wrote: >> >>> >>> >>> >>> On Fri, Mar 21, 2014 at 8:46 PM, Charles R Harris < >>> charlesr.harris at gmail.com> wrote: >>> >>>> >>>> >>>> >>>> On Fri, Mar 21, 2014 at 6:26 PM, Alan G Isaac wrote: >>>> >>>>> The documentation of numpy.unique >>>>> http://docs.scipy.org/doc/numpy/reference/generated/numpy.unique.html >>>>> does not seem to promise that return_index=True will always index the >>>>> *first* occurrence of each unique item, which I believe is the current >>>>> behavior. >>>>> >>>>> A promise would be nice. >>>>> Is it intended? >>>>> >>>>> >>>> Yes, it is intended, although the required mergesort wasn't available >>>> for all types before numpy 1.7. >>>> >>> summary, AFAICS: since numpy 1.6.2 np.unique used mergesort if return_index=True and provides a stable sort. Josef > >>> Does this mean return_inverse works again for all cases, even with >>> return_index? >>> >>> I removed return_index from my code in statsmodels because I make >>> frequent use of return_inverse, which was broken. We don't have any >>> unittests in statsmodels anymore that use both return_xxx. >>> >>> >> I don't know, needs checking. Seems to work now with a simple trial array >> of integers. >> > > my example from may 2012, thread "1.6.2 no more unique for rows" > works fine on python 3.3 numpy 1.7.1 > > >>> groups = np.random.randint(0,4,size=(10,2)) > >>> groups_ = groups.view([('',groups.dtype)]*groups.shape[1]).flatten() > >>> uni, uni_idx, uni_inv = np.unique(groups_, return_index=True, > return_inverse=True) > >>> uni > array([(0, 2), (0, 3), (1, 0), (2, 1), (2, 2), (3, 2), (3, 3)], > dtype=[('f0', ' >>> uni_inv > array([1, 6, 3, 4, 5, 3, 2, 5, 0, 2], dtype=int32) > >>> np.__version__ > '1.7.1' > > Thanks, > > Josef > > >> Chuck >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Sat Mar 22 14:13:45 2014 From: njs at pobox.com (Nathaniel Smith) Date: Sat, 22 Mar 2014 18:13:45 +0000 Subject: [Numpy-discussion] Resolving the associativity/precedence debate for @ Message-ID: Hi all, After 88 emails we don't have a conclusion in the other thread (see [1] for background). But we have to come to some conclusion or another if we want @ to exist :-). So I'll summarize where the discussion stands and let's see if we can find some way to resolve this. The fundamental question is whether a chain like (a @ b @ c) should be evaluated left-to-right (left-associativity) or right-to-left (right-associativity). DATA SOURCE 1: This isn't a democratic vote, but it's useful to get a sense of people's intuitions. Counting messages in the other thread, opinion seems to be pretty evenly split: == "Votes" for right-associativity == Weak-right: [2] [3] [5] Tight-right: [4] [6] Same-right: [11] == "Votes" for left-associativity == Same-left: [7] [8] [14] [15] [16] Tight-left: [9] Weak-left: [12] There's also the "grouping" option (described in [10]), but that's received very little support (just [13]). DATA SOURCE 2: Several people have suggested that performance considerations mean that right-to-left evaluation is more common in practice than left-to-right evaluation. But, if we look at actual usage in Python code, that's not what we find: when people call dot() in chains, then they're about evenly split, and actually use the left-to-right, left-associative order slightly more often than the right-to-left, right-associative order: http://mail.scipy.org/pipermail/numpy-discussion/2014-March/069578.html DATA SOURCE 3: And if we look at other languages, then we find: == "Votes" for right-associativity == == "Votes" for left-associativity == Same-left: Matlab, Julia, IDL, GAUSS Tight-left: R And Mathematica uses the "grouping" approach. ARGUMENTS: The final outcome of this is that I need to write a piece of text that says what our (at least rough) consensus is, and lays out the reasons. So long as the "vote" is so evenly split, I can't really do this. But I can imagine what the different pieces of text might look like. THE CASE FOR LEFT-ASSOCIATIVITY: If I were writing this text in favor of left-associativity, I'd point out: - "Special cases aren't special enough to break the rules". Every single operator in Python besides ** is left-associative (and ** has very compelling arguments for right associativity). @ does not have similarly compelling arguments. If we were having this debate about "*", then it'd probably be much more lopsided towards left-associativity. So sure, there's something about @ that makes right-associativity *more* appealing than for most other operators. But not *much* more appealing -- left-associativity still comes out at least slightly ahead in all of the above measures. And there are a lot of benefits to avoiding special cases -- it gives fewer rules to memorize, fewer rules to remember, etc. So @ may be a special case, but it's not special enough. - Other languages with @ operators almost overwhelmingly use the "same-left" rule, and I've never heard anyone complain about this, so clearly nothing horrible will happen if we go this way. We have no comparable experience for right-associativity. - Given left-associativity, then there's good agreement about the appropriate precedence. If we choose right-associativity then it's much less clear (which will then make things harder for experts to remember, harder for non-experts to guess, etc.). Note that one of the votes for right-associativity even preferred the "same-right" rule, which is not even technically possible... This strikes me as a nice solid case. THE CASE FOR RIGHT-ASSOCIATIVITY: If I were writing this text in favor of right-associativity, I'd point out: - Because matrix multiplication has a tight conceptual association with function application/composition, many mathematically sophisticated users have an intuition that a matrix expression like R S x proceeds from right-to-left, with first S transforming x, and then R transforming the result. This isn't universally agreed, but at the least this intuition is more common than for other operations like 2 * 3 * 4 that everyone reads as going from left-to-right. - There might be some speed argument, if people often write things like "Mat @ Mat @ vec"? But no-one has found any evidence that people actually do write such things often. - There's been discussion of how right-associativity might maybe perhaps be nice for non-matmul applications? But I can't use those arguments [17] [18]. - ...... I got nothin'. I am fine with any outcome here. (I'm actually listed under *both* tight-right and same-left in the straw poll above ;-).) I'm totally happy to go back to Guido et al and argue for right-associativity. BUT if you all want me to do that then you need to give me some better arguments to use :-). One way to do this might be to go through the ((a @ b) @ c) and (a @ (b @ c)) examples I found (the scripts are available [19], and I can help modify them to spit out more details), look at the actual code, and demonstrate that the left-to-right ((a @ b) @ c) cases are mostly ones where evaluation order doesn't matter (i.e., they could have just as well been written the other way), and the right-to-left (a @ (b @ c)) ones are ones where right-to-left really is better than left-to-right. I have no idea if this is true, and it'll require some reading of the surrounding code to figure out what the matrix shapes are, but if it *is* true then at least that'd be something solid that right-associativity advocates could point to. WHAT NOW: If seeing this summary laid out caused you to change your mind one way or the other, then please reply and say so! If you think of some way to get more data that could favor one or the other option (running some kind of usability experiment on people? [20]), then please share! If you think of some other arguments in favor of left-associativity, then please share! If you think of some other arguments in favor of right-associativity, especially ones that are based on something besides your gut feeling, then PLEASE PLEASE share! Thanks, -n [1] http://mail.scipy.org/pipermail/numpy-discussion/2014-March/069444.html [2] http://mail.scipy.org/pipermail/numpy-discussion/2014-March/069446.html [3] http://mail.scipy.org/pipermail/numpy-discussion/2014-March/069450.html [4] http://mail.scipy.org/pipermail/numpy-discussion/2014-March/069452.html [5] http://mail.scipy.org/pipermail/numpy-discussion/2014-March/069455.html [6] https://mail.python.org/pipermail/python-ideas/2014-March/027124.html [7] http://mail.scipy.org/pipermail/numpy-discussion/2014-March/069512.html [8] http://mail.scipy.org/pipermail/numpy-discussion/2014-March/069513.html [9] http://mail.scipy.org/pipermail/numpy-discussion/2014-March/069467.html [10] http://mail.scipy.org/pipermail/numpy-discussion/2014-March/069530.html [11] http://mail.scipy.org/pipermail/numpy-discussion/2014-March/069537.html [12] http://mail.scipy.org/pipermail/numpy-discussion/2014-March/069540.html [13] http://mail.scipy.org/pipermail/numpy-discussion/2014-March/069571.html [14] http://mail.scipy.org/pipermail/numpy-discussion/2014-March/069514.html [15] http://mail.scipy.org/pipermail/numpy-discussion/2014-March/069531.html [16] http://mail.scipy.org/pipermail/numpy-discussion/2014-March/069567.html [17] http://mail.scipy.org/pipermail/numpy-discussion/2014-March/069527.html [18] http://mail.scipy.org/pipermail/numpy-discussion/2014-March/069584.html [19] https://gist.github.com/njsmith/9157645 [20] http://mail.scipy.org/pipermail/numpy-discussion/2014-March/069584.html -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org From ndarray at mac.com Sat Mar 22 14:44:13 2014 From: ndarray at mac.com (Alexander Belopolsky) Date: Sat, 22 Mar 2014 14:44:13 -0400 Subject: [Numpy-discussion] Resolving the associativity/precedence debate for @ In-Reply-To: References: Message-ID: On Sat, Mar 22, 2014 at 2:13 PM, Nathaniel Smith wrote: > If you think of some other arguments in favor of left-associativity, > then please share! > I argued on python-ideas [1] that given the display properties of python lists and numpy arrays, vec @ Mat is more natural than Mat @ vec. The latter comes from an old tradition of laying out vector components vertically in print and on a blackboard, but horizontal layout is more natural at python prompt and in a text editor. [1] https://mail.python.org/pipermail/python-ideas/2014-March/027169.html -------------- next part -------------- An HTML attachment was scrubbed... URL: From efiring at hawaii.edu Sat Mar 22 14:56:54 2014 From: efiring at hawaii.edu (Eric Firing) Date: Sat, 22 Mar 2014 08:56:54 -1000 Subject: [Numpy-discussion] Resolving the associativity/precedence debate for @ In-Reply-To: References: Message-ID: <532DDCF6.9000908@hawaii.edu> On 2014/03/22 8:13 AM, Nathaniel Smith wrote: > Hi all, > > After 88 emails we don't have a conclusion in the other thread (see > [1] for background). But we have to come to some conclusion or another > if we want @ to exist:-). So I'll summarize where the discussion > stands and let's see if we can find some way to resolve this. In case a "vote" from a previously "non-voting" reader helps: I think the case for same-left, as you state it, is strong; it's simple and easy to remember, and *this* *matters*. A *strong* argument would be needed to override this consideration, and I haven't seen any such strong argument. The basic advice to users is: be explicit--use parentheses as needed to show both the interpreter and readers of your code how you want the expression to be evaluated. Relying on precedence and associativity works only when the rules are well established by convention, and the expression is quite simple. Eric From eric at depagne.org Sat Mar 22 14:00:46 2014 From: eric at depagne.org (=?ISO-8859-1?Q?=C9ric?= Depagne) Date: Sat, 22 Mar 2014 21:00:46 +0300 Subject: [Numpy-discussion] Resolving the associativity/precedence debate for @ In-Reply-To: References: Message-ID: <5078193.xtWOI2BqDU@localhost.localdomain> Hi Nate, Many thanks first for the efforts you put in this. I'm not a computer scientist, but will give my opinion as physicist. As such, when I see A x B x C (A, B and C being matrices), I tend to read it from right to left : Ax (BxC). But if the size of the matrices do not match like this, then I'll read it the other way round. Moreover, matrix diagonalization is always written so that the operations are done from right to left (P^-1 x A x P is read do AxP first, then multiply by P?1) ?ric. -- Un clavier azerty en vaut deux ---------------------------------------------------------- ?ric Depagne eric at depagne.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Sat Mar 22 15:59:36 2014 From: robert.kern at gmail.com (Robert Kern) Date: Sat, 22 Mar 2014 19:59:36 +0000 Subject: [Numpy-discussion] Resolving the associativity/precedence debate for @ In-Reply-To: References: Message-ID: On Sat, Mar 22, 2014 at 6:13 PM, Nathaniel Smith wrote: > Hi all, > > After 88 emails we don't have a conclusion in the other thread (see > [1] for background). But we have to come to some conclusion or another > if we want @ to exist :-). So I'll summarize where the discussion > stands and let's see if we can find some way to resolve this. "The numpy community has no consensus strongly preferring one option over another" is a perfectly fine conclusion to this thread on numpy-discussion, IMO. Actually deciding what goes into the PEP given that input and merged with other concerns should probably happen on python-ideas. -- Robert Kern From andrea.gavana at gmail.com Sat Mar 22 16:10:10 2014 From: andrea.gavana at gmail.com (Andrea Gavana) Date: Sat, 22 Mar 2014 21:10:10 +0100 Subject: [Numpy-discussion] Resolving the associativity/precedence debate for @ In-Reply-To: References: Message-ID: Hi, On 22 March 2014 19:13, Nathaniel Smith wrote: > Hi all, > > After 88 emails we don't have a conclusion in the other thread (see > [1] for background). But we have to come to some conclusion or another > if we want @ to exist :-). So I'll summarize where the discussion > stands and let's see if we can find some way to resolve this. > > The fundamental question is whether a chain like (a @ b @ c) should be > evaluated left-to-right (left-associativity) or right-to-left > (right-associativity). > > I have been following this discussion and the PEP with much interest and, just to state the obvious, the addition of a matrix-multiplication operator in Python is way overdue. If I had to judge from the oil industry point of view only, in recent years the adoption of Python (and, by consequence, NumPy) as a number crunching platform has grown exponentially. I could cite dozens of non-performance-critical examples of commercial tools that switched from close-to-unmantainable Fortran/C implementations or (please forgive us...) hieroglyphic-style Perl code to Python. That said, if you're still interested in a social experiment about the precedence of "@", I can share a real-life one - albeit on a small sample of people (15). This is the background: 1. I'm about to teach an internal course on Python/NumPy/other in the company, so I polled the participants on their intuition about the "@" operator precedence; 2. We are *not* math gurus, but we do use NumPy on terabyte-scale data pretty much on a daily basis; 3. We are not "heavy" users of the "dot' method, but our various pieces of code contains quite a few calls to it; 4. All Python operators have left-associativity, excluding "**"; 5. Python code is read left-to-right. So, by asking the question: how do you interpret the expression "a @ b @ c", this is a summary of what I got from the participants: 1. Twelve (12) said they would interpret it as: "first do a at b, then matrix-multiply the results with c"; 2. Two (2) said they had no idea; 3. One (1) applied the right-associativity rule; 4. Whatever the Numpy-dev or Python-dev decision is, no one of us is ever, *ever* going to write "a @ b @ c" without parenthesis, to make clear the ordering of operations. I'm not going to pass judgments on the social experiment nor to invoke the Zen here, even though I fail to see how "@" is such a special case to break the standard rules. Not every NumPy user is a high-level math-educated person, or even if he/she is, he/she may have forgotten the basics of it. Why confuse him/her more? Andrea. "Imagination Is The Only Weapon In The War Against Reality." http://www.infinity77.net # ------------------------------------------------------------- # def ask_mailing_list_support(email): if mention_platform_and_version() and include_sample_app(): send_message(email) else: install_malware() erase_hard_drives() # ------------------------------------------------------------- # -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sat Mar 22 16:19:17 2014 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 22 Mar 2014 14:19:17 -0600 Subject: [Numpy-discussion] Resolving the associativity/precedence debate for @ In-Reply-To: References: Message-ID: On Sat, Mar 22, 2014 at 12:13 PM, Nathaniel Smith wrote: > Hi all, > > After 88 emails we don't have a conclusion in the other thread (see > [1] for background). But we have to come to some conclusion or another > if we want @ to exist :-). So I'll summarize where the discussion > stands and let's see if we can find some way to resolve this. > > The fundamental question is whether a chain like (a @ b @ c) should be > evaluated left-to-right (left-associativity) or right-to-left > (right-associativity). > > DATA SOURCE 1: > > This isn't a democratic vote, but it's useful to get a sense of > people's intuitions. Counting messages in the other thread, opinion > seems to be pretty evenly split: > > == "Votes" for right-associativity == > Weak-right: [2] [3] [5] > Tight-right: [4] [6] > Same-right: [11] > > == "Votes" for left-associativity == > Same-left: [7] [8] [14] [15] [16] > Tight-left: [9] > Weak-left: [12] > > There's also the "grouping" option (described in [10]), but that's > received very little support (just [13]). > > DATA SOURCE 2: > > Several people have suggested that performance considerations mean > that right-to-left evaluation is more common in practice than > left-to-right evaluation. But, if we look at actual usage in Python > code, that's not what we find: when people call dot() in chains, then > they're about evenly split, and actually use the left-to-right, > left-associative order slightly more often than the right-to-left, > right-associative order: > http://mail.scipy.org/pipermail/numpy-discussion/2014-March/069578.html > > DATA SOURCE 3: > > And if we look at other languages, then we find: > > == "Votes" for right-associativity == > > > == "Votes" for left-associativity == > Same-left: Matlab, Julia, IDL, GAUSS > Tight-left: R > > And Mathematica uses the "grouping" approach. > > ARGUMENTS: > > The final outcome of this is that I need to write a piece of text that > says what our (at least rough) consensus is, and lays out the reasons. > So long as the "vote" is so evenly split, I can't really do this. But > I can imagine what the different pieces of text might look like. > > THE CASE FOR LEFT-ASSOCIATIVITY: > > If I were writing this text in favor of left-associativity, I'd point out: > > - "Special cases aren't special enough to break the rules". Every > single operator in Python besides ** is left-associative (and ** has > very compelling arguments for right associativity). @ does not have > similarly compelling arguments. If we were having this debate about > "*", then it'd probably be much more lopsided towards > left-associativity. So sure, there's something about @ that makes > right-associativity *more* appealing than for most other operators. > But not *much* more appealing -- left-associativity still comes out at > least slightly ahead in all of the above measures. And there are a lot > of benefits to avoiding special cases -- it gives fewer rules to > memorize, fewer rules to remember, etc. So @ may be a special case, > but it's not special enough. > > - Other languages with @ operators almost overwhelmingly use the > "same-left" rule, and I've never heard anyone complain about this, so > clearly nothing horrible will happen if we go this way. We have no > comparable experience for right-associativity. > > - Given left-associativity, then there's good agreement about the > appropriate precedence. If we choose right-associativity then it's > much less clear (which will then make things harder for experts to > remember, harder for non-experts to guess, etc.). Note that one of the > votes for right-associativity even preferred the "same-right" rule, > which is not even technically possible... > > This strikes me as a nice solid case. > > THE CASE FOR RIGHT-ASSOCIATIVITY: > > If I were writing this text in favor of right-associativity, I'd point out: > > - Because matrix multiplication has a tight conceptual association > with function application/composition, many mathematically > sophisticated users have an intuition that a matrix expression like > R S x > proceeds from right-to-left, with first S transforming x, and then R > transforming the result. This isn't universally agreed, but at the > least this intuition is more common than for other operations like 2 * > 3 * 4 that everyone reads as going from left-to-right. > > - There might be some speed argument, if people often write things > like "Mat @ Mat @ vec"? But no-one has found any evidence that people > actually do write such things often. > > - There's been discussion of how right-associativity might maybe > perhaps be nice for non-matmul applications? But I can't use those > arguments [17] [18]. > > - ...... I got nothin'. > > I am fine with any outcome here. (I'm actually listed under *both* > tight-right and same-left in the straw poll above ;-).) I'm totally > happy to go back to Guido et al and argue for right-associativity. BUT > if you all want me to do that then you need to give me some better > arguments to use :-). > > One way to do this might be to go through the ((a @ b) @ c) and (a @ > (b @ c)) examples I found (the scripts are available [19], and I can > help modify them to spit out more details), look at the actual code, > and demonstrate that the left-to-right ((a @ b) @ c) cases are mostly > ones where evaluation order doesn't matter (i.e., they could have just > as well been written the other way), and the right-to-left (a @ (b @ > c)) ones are ones where right-to-left really is better than > left-to-right. I have no idea if this is true, and it'll require some > reading of the surrounding code to figure out what the matrix shapes > are, but if it *is* true then at least that'd be something solid that > right-associativity advocates could point to. > > WHAT NOW: > > If seeing this summary laid out caused you to change your mind one way > or the other, then please reply and say so! > > If you think of some way to get more data that could favor one or the > other option (running some kind of usability experiment on people? > [20]), then please share! > > If you think of some other arguments in favor of left-associativity, > then please share! > > If you think of some other arguments in favor of right-associativity, > especially ones that are based on something besides your gut feeling, > then PLEASE PLEASE share! > > Well, I this point I think we might as well go with left associativity. Most of the operator uses looked to involve a single `@`, where it doesn't matter, and the others were short where adding a couple of parenthesis wouldn't mess things up too much. The long expressions I've seen also tend to group naturally and probably one would either compute the parts separately or use parenthesis for clarity in any case. So I think the case for the practical usefulness of right associativity is weak at this point. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sat Mar 22 22:07:52 2014 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 22 Mar 2014 20:07:52 -0600 Subject: [Numpy-discussion] 1.9.0 release runup Message-ID: Hi All, It is time to start looking forward to the 1.9.0 release. Currently there are some 76 open PRs and they keep rolling in, which is good, but we need to decide on what is important for 1.9 and what can be put off to 1.10 because otherwise we will never finish. The datetime problems and some of the deprecations/futurewarnings that were present in 1.8 need to be dealt with. The nanmedian stuff will make a nice addition to the nan functions. Apart from those, if you have a PR or fix that you think needs to be in 1.9, please make it known. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From sturla.molden at gmail.com Sat Mar 22 22:07:29 2014 From: sturla.molden at gmail.com (Sturla Molden) Date: Sun, 23 Mar 2014 02:07:29 +0000 (UTC) Subject: [Numpy-discussion] Resolving the associativity/precedence debate for @ References: Message-ID: <1503947617417232908.655877sturla.molden-gmail.com@news.gmane.org> Charles R Harris wrote: > Well, I this point I think we might as well go with left associativity. > Most of the operator uses looked to involve a single `@`, where it doesn't > matter, and the others were short where adding a couple of parenthesis > wouldn't mess things up too much. That is what most Python operators do. Right associativity will just be confusing. ** is right associative because of the way exponentiation is written in text. As I see it, left associativity of @ is clearly the better option. Sturla From njs at pobox.com Sat Mar 22 22:14:36 2014 From: njs at pobox.com (Nathaniel Smith) Date: Sun, 23 Mar 2014 02:14:36 +0000 Subject: [Numpy-discussion] Resolving the associativity/precedence debate for @ In-Reply-To: References: Message-ID: On Sat, Mar 22, 2014 at 7:59 PM, Robert Kern wrote: > On Sat, Mar 22, 2014 at 6:13 PM, Nathaniel Smith wrote: >> Hi all, >> >> After 88 emails we don't have a conclusion in the other thread (see >> [1] for background). But we have to come to some conclusion or another >> if we want @ to exist :-). So I'll summarize where the discussion >> stands and let's see if we can find some way to resolve this. > > "The numpy community has no consensus strongly preferring one option > over another" is a perfectly fine conclusion to this thread on > numpy-discussion, IMO. Actually deciding what goes into the PEP given > that input and merged with other concerns should probably happen on > python-ideas. Yep, if we converge on deadlock then that's what we'll do, but I'm not yet convinced that we've converged at all. In the last few hours the "vote" deltas are right -1, left +3... -n -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org From charlesr.harris at gmail.com Sat Mar 22 22:28:18 2014 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 22 Mar 2014 20:28:18 -0600 Subject: [Numpy-discussion] Numpy 1.8.1 release Message-ID: Hi All, It is time for the 1.8.1 release to go forward. I'm on the fence as to whether to do an rc2 or just release and do a 1.8.2 if needed. The problems noted with the 1.8.1rc1 should be fixed, but if you are in a position to test the current 1.8.x branch, please give it a try. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From sturla.molden at gmail.com Sat Mar 22 22:35:05 2014 From: sturla.molden at gmail.com (Sturla Molden) Date: Sun, 23 Mar 2014 02:35:05 +0000 (UTC) Subject: [Numpy-discussion] Resolving the associativity/precedence debate for @ References: Message-ID: <1725835203417233589.705803sturla.molden-gmail.com@news.gmane.org> Nathaniel Smith wrote: > - There might be some speed argument, if people often write things > like "Mat @ Mat @ vec"? But no-one has found any evidence that people > actually do write such things often. With left associativity, this would be an algorithmic optimization: Mat @ (Mat @ vec) Mat @ (Mat @ (Mat @ vec)) On the other hand, this vec.T @ Mat @ Mat would not need parentheses for optimisation when the associativity is left. With right associativity, we get the same optimisation problem as well: (vec.T @ Mat) @ Mat ((vec.T @ Mat) @ Mat) @ Mat Personally I believe this advice to the novice programmer belongs in the documentation. If we just include it in the NumPy documentation, it will not be a problem. Advices about how to optimize numerical expressions should not be special cases in the syntax, in my opinion. That just makes the Python language harder to learn. Rather, it is a documentation problem for NumPy. We should write the NumPy documentation for @ such that the novice programmer easily understands the computational complexities of linear algebra operations. The PEP might include this as well, so the knowledge propagates into the rest of the Python language litterature, not just the NumPy docs. By the way, the * operator for np.matrix and Matlab matrices are left associative as well. This does not produce any problems. Sturla From ndarray at mac.com Sat Mar 22 22:51:06 2014 From: ndarray at mac.com (Alexander Belopolsky) Date: Sat, 22 Mar 2014 22:51:06 -0400 Subject: [Numpy-discussion] Resolving the associativity/precedence debate for @ In-Reply-To: <1725835203417233589.705803sturla.molden-gmail.com@news.gmane.org> References: <1725835203417233589.705803sturla.molden-gmail.com@news.gmane.org> Message-ID: On Sat, Mar 22, 2014 at 10:35 PM, Sturla Molden wrote: > On the other hand, this > > vec.T @ Mat @ Mat > > would not need parentheses for optimisation when the associativity is left. > > Nor does it require .T if vec is 1d. > > By the way, the * operator for np.matrix and Matlab matrices are left > associative as well. > This is a very strong argument, IMO. If we want to win over the hearts of np.matrix users, we should not tell them - BTW - treat @ as you do **, not *. -------------- next part -------------- An HTML attachment was scrubbed... URL: From cgohlke at uci.edu Sun Mar 23 00:12:38 2014 From: cgohlke at uci.edu (Christoph Gohlke) Date: Sat, 22 Mar 2014 21:12:38 -0700 Subject: [Numpy-discussion] Numpy 1.8.1 release In-Reply-To: References: Message-ID: <532E5F36.3090206@uci.edu> On 3/22/2014 7:28 PM, Charles R Harris wrote: > Hi All, > > It is time for the 1.8.1 release to go forward. I'm on the fence as to > whether to do an rc2 or just release and do a 1.8.2 if needed. The > problems noted with the 1.8.1rc1 should be fixed, but if you are in a > position to test the current 1.8.x branch, please give it a try. > > Chuck > Hello, LGTM: all builds and tests are passing on Windows with msvc and MKL. Christoph From charlesr.harris at gmail.com Sun Mar 23 00:41:28 2014 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 22 Mar 2014 22:41:28 -0600 Subject: [Numpy-discussion] Numpy 1.8.1 release In-Reply-To: <532E5F36.3090206@uci.edu> References: <532E5F36.3090206@uci.edu> Message-ID: On Sat, Mar 22, 2014 at 10:12 PM, Christoph Gohlke wrote: > On 3/22/2014 7:28 PM, Charles R Harris wrote: > > Hi All, > > > > It is time for the 1.8.1 release to go forward. I'm on the fence as to > > whether to do an rc2 or just release and do a 1.8.2 if needed. The > > problems noted with the 1.8.1rc1 should be fixed, but if you are in a > > position to test the current 1.8.x branch, please give it a try. > > > > Chuck > > > > Hello, > > LGTM: all builds and tests are passing on Windows with msvc and MKL. > > Thanks Christoph, that gives me a lot more confidence. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Sun Mar 23 06:28:02 2014 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sun, 23 Mar 2014 11:28:02 +0100 Subject: [Numpy-discussion] Resolving the associativity/precedence debate for @ In-Reply-To: References: Message-ID: On Sun, Mar 23, 2014 at 3:14 AM, Nathaniel Smith wrote: > On Sat, Mar 22, 2014 at 7:59 PM, Robert Kern > wrote: > > On Sat, Mar 22, 2014 at 6:13 PM, Nathaniel Smith wrote: > >> Hi all, > >> > >> After 88 emails we don't have a conclusion in the other thread (see > >> [1] for background). But we have to come to some conclusion or another > >> if we want @ to exist :-). So I'll summarize where the discussion > >> stands and let's see if we can find some way to resolve this. > > > > "The numpy community has no consensus strongly preferring one option > > over another" is a perfectly fine conclusion to this thread on > > numpy-discussion, IMO. Actually deciding what goes into the PEP given > > that input and merged with other concerns should probably happen on > > python-ideas. > > Yep, if we converge on deadlock then that's what we'll do, but I'm not > yet convinced that we've converged at all. In the last few hours the > "vote" deltas are right -1, left +3... > If it helps, my +1 is also for left associative. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From projetmbc at gmail.com Sun Mar 23 06:54:58 2014 From: projetmbc at gmail.com (Christophe Bal) Date: Sun, 23 Mar 2014 11:54:58 +0100 Subject: [Numpy-discussion] Resolving the associativity/precedence debate for @ In-Reply-To: References: Message-ID: The left associativity should be the less disturbing choice. Using parenthesis to force right associativity will be not too painful. How many matrices in long product are involved concretely in Numpy projects ? On the other hand, Nathaniel proposed a new way to evaluate associative operators, and I really think that would be the best way to manage products of matrices. The idea is fir example to see A at B@C at D as __atmul__(A, B, C, D). Christophe BAL -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Sun Mar 23 07:24:54 2014 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sun, 23 Mar 2014 12:24:54 +0100 Subject: [Numpy-discussion] Numpy 1.8.1 release In-Reply-To: <532E5F36.3090206@uci.edu> References: <532E5F36.3090206@uci.edu> Message-ID: On Sun, Mar 23, 2014 at 5:12 AM, Christoph Gohlke wrote: > On 3/22/2014 7:28 PM, Charles R Harris wrote: > > Hi All, > > > > It is time for the 1.8.1 release to go forward. I'm on the fence as to > > whether to do an rc2 or just release and do a 1.8.2 if needed. The > > problems noted with the 1.8.1rc1 should be fixed, but if you are in a > > position to test the current 1.8.x branch, please give it a try. > > > > Chuck > > > > Hello, > > LGTM: all builds and tests are passing on Windows with msvc and MKL. > All OK on 32-bit Linux and with scipy 0.14.x as well. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Sun Mar 23 08:56:53 2014 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sun, 23 Mar 2014 13:56:53 +0100 Subject: [Numpy-discussion] 1.9.0 release runup In-Reply-To: References: Message-ID: On Sun, Mar 23, 2014 at 3:07 AM, Charles R Harris wrote: > Hi All, > > It is time to start looking forward to the 1.9.0 release. Currently there > are some 76 open PRs and they keep rolling in, which is good, > To make the PR list a bit more manageable, I would suggest to start closing the ones which are not in a state to get merged and haven't seen activity by the author for >3 months. And add in the dev guide that this is normal policy and that authors are free to reopen the PR when they continue working on it. but we need to decide on what is important for 1.9 and what can be put off > to 1.10 because otherwise we will never finish. The datetime problems and > some of the deprecations/futurewarnings that were present in 1.8 need to be > dealt with. The nanmedian stuff will make a nice addition to the nan > functions. Apart from those, if you have a PR or fix that you think needs > to be in 1.9, please make it known. > The boolean subtract and ellipsis indexing deprecations probably need reconsidering. I get 78 test errors right now because of those if I test scipy master against numpy master. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sun Mar 23 09:19:13 2014 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 23 Mar 2014 07:19:13 -0600 Subject: [Numpy-discussion] Numpy 1.8.1 release In-Reply-To: References: <532E5F36.3090206@uci.edu> Message-ID: On Sun, Mar 23, 2014 at 5:24 AM, Ralf Gommers wrote: > > > > On Sun, Mar 23, 2014 at 5:12 AM, Christoph Gohlke wrote: > >> On 3/22/2014 7:28 PM, Charles R Harris wrote: >> > Hi All, >> > >> > It is time for the 1.8.1 release to go forward. I'm on the fence as to >> > whether to do an rc2 or just release and do a 1.8.2 if needed. The >> > problems noted with the 1.8.1rc1 should be fixed, but if you are in a >> > position to test the current 1.8.x branch, please give it a try. >> > >> > Chuck >> > >> >> Hello, >> >> LGTM: all builds and tests are passing on Windows with msvc and MKL. >> > > All OK on 32-bit Linux and with scipy 0.14.x as well. > > Thanks for checking Ralf. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sun Mar 23 09:26:33 2014 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 23 Mar 2014 07:26:33 -0600 Subject: [Numpy-discussion] 1.9.0 release runup In-Reply-To: References: Message-ID: On Sun, Mar 23, 2014 at 6:56 AM, Ralf Gommers wrote: > > > > On Sun, Mar 23, 2014 at 3:07 AM, Charles R Harris < > charlesr.harris at gmail.com> wrote: > >> Hi All, >> >> It is time to start looking forward to the 1.9.0 release. Currently there >> are some 76 open PRs and they keep rolling in, which is good, >> > > To make the PR list a bit more manageable, I would suggest to start > closing the ones which are not in a state to get merged and haven't seen > activity by the author for >3 months. And add in the dev guide that this is > normal policy and that authors are free to reopen the PR when they continue > working on it. > I'd feel better about doing that if PR's were reviewed and dealt with on a regular basis, but we aren't quite there yet. That said, I'd like to keep the number down in the 30-40 range. > > but we need to decide on what is important for 1.9 and what can be put off >> to 1.10 because otherwise we will never finish. The datetime problems and >> some of the deprecations/futurewarnings that were present in 1.8 need to be >> dealt with. The nanmedian stuff will make a nice addition to the nan >> functions. Apart from those, if you have a PR or fix that you think needs >> to be in 1.9, please make it known. >> > > The boolean subtract and ellipsis indexing deprecations probably need > reconsidering. I get 78 test errors right now because of those if I test > scipy master against numpy master. > > That's a lot of errors. Do you think they should be reverted permanently or just for 1.9? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Sun Mar 23 15:30:28 2014 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Sun, 23 Mar 2014 20:30:28 +0100 Subject: [Numpy-discussion] 1.9.0 release runup In-Reply-To: References: Message-ID: <1395603028.5553.5.camel@sebastian-t440> On So, 2014-03-23 at 07:26 -0600, Charles R Harris wrote: > > > > On Sun, Mar 23, 2014 at 6:56 AM, Ralf Gommers > wrote: > > > > On Sun, Mar 23, 2014 at 3:07 AM, Charles R Harris > wrote: > Hi All, > > > It is time to start looking forward to the 1.9.0 > release. Currently there are some 76 open PRs and they > keep rolling in, which is good, > > > To make the PR list a bit more manageable, I would suggest to > start closing the ones which are not in a state to get merged > and haven't seen activity by the author for >3 months. And add > in the dev guide that this is normal policy and that authors > are free to reopen the PR when they continue working on it. > > > > I'd feel better about doing that if PR's were reviewed and dealt with > on a regular basis, but we aren't quite there yet. That said, I'd like > to keep the number down in the 30-40 range. > > > > > but we need to decide on what is important for 1.9 and > what can be put off to 1.10 because otherwise we will > never finish. The datetime problems and some of the > deprecations/futurewarnings that were present in 1.8 > need to be dealt with. The nanmedian stuff will make a > nice addition to the nan functions. Apart from those, > if you have a PR or fix that you think needs to be in > 1.9, please make it known. > > > > The boolean subtract and ellipsis indexing deprecations > probably need reconsidering. I get 78 test errors right now > because of those if I test scipy master against numpy master. > > > > > > That's a lot of errors. Do you think they should be reverted > permanently or just for 1.9? Good question. Just to note, I don't mind reverting/removing these. I was somewhat aware that the double ellipsis caused a lot scipy failures, but they seemed mostly in the tests with code like `arr[..., ...]` and I didn't check if it might be more trouble then gain. - Sebastian > > Chuck > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From ralf.gommers at gmail.com Sun Mar 23 17:18:30 2014 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sun, 23 Mar 2014 22:18:30 +0100 Subject: [Numpy-discussion] 1.9.0 release runup In-Reply-To: <1395603028.5553.5.camel@sebastian-t440> References: <1395603028.5553.5.camel@sebastian-t440> Message-ID: On Sun, Mar 23, 2014 at 8:30 PM, Sebastian Berg wrote: > On So, 2014-03-23 at 07:26 -0600, Charles R Harris wrote: > > > > > > > > On Sun, Mar 23, 2014 at 6:56 AM, Ralf Gommers > > wrote: > > > > > > > > On Sun, Mar 23, 2014 at 3:07 AM, Charles R Harris > > wrote: > > Hi All, > > > > > > It is time to start looking forward to the 1.9.0 > > release. Currently there are some 76 open PRs and they > > keep rolling in, which is good, > > > > > > To make the PR list a bit more manageable, I would suggest to > > start closing the ones which are not in a state to get merged > > and haven't seen activity by the author for >3 months. And add > > in the dev guide that this is normal policy and that authors > > are free to reopen the PR when they continue working on it. > > > > > > > > I'd feel better about doing that if PR's were reviewed and dealt with > > on a regular basis, but we aren't quite there yet. That said, I'd like > > to keep the number down in the 30-40 range. > > > > > > > > > > but we need to decide on what is important for 1.9 and > > what can be put off to 1.10 because otherwise we will > > never finish. The datetime problems and some of the > > deprecations/futurewarnings that were present in 1.8 > > need to be dealt with. The nanmedian stuff will make a > > nice addition to the nan functions. Apart from those, > > if you have a PR or fix that you think needs to be in > > 1.9, please make it known. > > > > > > > > The boolean subtract and ellipsis indexing deprecations > > probably need reconsidering. I get 78 test errors right now > > because of those if I test scipy master against numpy master. > > > > > > > > > > > > That's a lot of errors. Do you think they should be reverted > > permanently or just for 1.9? > Temporarily probably. Assuming they were a good idea to start with. > Good question. Just to note, I don't mind reverting/removing these. I > was somewhat aware that the double ellipsis caused a lot scipy failures, > but they seemed mostly in the tests with code like `arr[..., ...]` and I > didn't check if it might be more trouble then gain. > IIRC we had something like this before with the safe casting changes in 1.6.x. We could do the following: 1. fix the issues seen in scipy (and scikits etc.) now. 2. revert this change for 1.9.x so it doesn't cause issues with released versions. 3. re-introduce the deprecations in a year or so. In a year scipy will have 2 released versions with the fixes from (1). Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From matt at pagan.io Sun Mar 23 23:06:35 2014 From: matt at pagan.io (Matt Pagan) Date: Mon, 24 Mar 2014 03:06:35 +0000 Subject: [Numpy-discussion] Implementing elementary matrices Message-ID: <532FA13B.2070909@pagan.io> Greetings! I made a patch for NumPy that adds a function for easily creating elementary matrices. Sorry for not knowing the process for submitting patches. Is this function something the NumPy community could see adding to the codebase? Are there ways I can improve on this? diff --git a/numpy/lib/twodim_base.py b/numpy/lib/twodim_base.py index 12c0f9b..10073af 100644 --- a/numpy/lib/twodim_base.py +++ b/numpy/lib/twodim_base.py @@ -967,3 +967,85 @@ def triu_indices_from(arr, k=0): if not (arr.ndim == 2 and arr.shape[0] == arr.shape[1]): raise ValueError("input array must be 2-d and square") return triu_indices(arr.shape[0], k) + +def elem(N, i, j=None, t=None, dtype=float): + """ + Return an elementary matrix. + + Parameters + ---------- + N : int + The size of the NxN array to be returned. Elementary matrices + should be square. + i : int + The index of the first row on which operations are to be + performed. + j : int + If set, the index of the second row of which operations are to + be performed. + t : scalar + If set, the factor by which a given row will be multiplied. + + Returns + ------- + m: ndarray of shape (NxN) + The identity matrix after a single row operation has been + performed on it. + + See also + -------- + eye, identity + + Examples + ------- + To swap the the first and third rows of a 4x4 identity matirx: + + >>> L = elem(4, 0, 2) + >>> L + array([[ 0., 0., 1., 0.], + [ 0., 1., 0., 0.], + [ 1., 0., 0., 0.], + [ 0., 0., 0., 1.]]) + + This array then becomes quite useful for matrix multiplication. + + >>> H = np.matrix([[ 2, 3, 5, 7], + [11, 13, 17, 19], + [23, 29, 31, 37], + [41, 43, 47, 53]]) + >>> L*H + matrix([[ 23., 29., 31., 37.], + [ 11., 13., 17., 19.], + [ 2., 3., 5., 7.], + [ 41., 43., 47., 53.]]) + + When the elemntary matrix is multiplied by the given matrix, the + result is the given matrix with it's first and third rows swapped. + + If the given matrix is multiplied by the elementary matrix (i.e., + the multiplication takes place in reverse order, the result is + the given matrix with its first and third columns swapped. + + >>> H*L + matrix([[ 5., 3., 2., 7.], + [ 17., 13., 11., 19.], + [ 31., 29., 23., 37.], + [ 47., 43., 41., 53.]]) + + """ + m=eye(N, dtype=dtype) + if j==None and t==None: + raise ValueError("One or more of %s and %s must be set." % \ + ('j', 't')) + return None + elif t==None: + swap = np.array(m[i]) + m[i] = m[j] + m[j] = swap + return m + elif j==None: + m[i] *= t + return m + else: + m[j] += (t * m[i]) + return m -- Matt Pagan matt at pagan.io PGP: 0xE9284418E360583C -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 801 bytes Desc: OpenPGP digital signature URL: From chris.barker at noaa.gov Mon Mar 24 00:39:27 2014 From: chris.barker at noaa.gov (Chris Barker) Date: Sun, 23 Mar 2014 21:39:27 -0700 Subject: [Numpy-discussion] Dates and times and Datetime64 (again) In-Reply-To: References: <79374FB2-205D-4B76-ADB2-F9895D3A2DF4@uwaterloo.ca> Message-ID: On Fri, Mar 21, 2014 at 3:43 PM, Nathaniel Smith wrote: > On Thu, Mar 20, 2014 at 11:27 PM, Chris Barker > wrote: > > * I think there are more or less three options: > > 1) a) don't have any timezone handling at all -- all datetime64s are > UTC. Always > > b) don't have any timezone handling at all -- all datetime64s > are naive > > (the only difference between these two is I/O of strings, > and maybe I/O of datetime objects with a time zone) > > 2) Have a time zone associated with the array -- defaulting to > either UTC or None, but don't provide any implementation other than the > tagging, with the ability to add in TZ handler if you want (can this be > done efficiently?) > > 3) Full on proper TZ handling. > > > > I think (3) is off the table for now. > > I think the first goal is to define what a plain vanilla datetime64 > does, without any extra attributes. This is for two practical reasons: > First, our overriding #1 goal is to fix the nasty I/O problems that > default datetime64's show, so until that's done any other bells and > whistles are a distraction. And second, adding parameters to dtypes > right now is technically messy. > > This rules out (2) and (3). > yup -- though I'm not sure I agree that we need to do this, if we are going to do something more later anyway. But you have a key point - maybe the dtype system simply isn't ready to do it right, and then it may be better not to try. In which case, we are down to naive or always UTC -- and again, those really aren't very different. Though I prefer naive -- always UTC adds some complication if you don't actually want UTC, and I'm not sure it actually buys us anything. And maybe it's jsut me, but all my code would need to use naive, so I"d be doing a bit of working around to use a UTC-always system. > If we additionally want to keep the option of adding a timezone > parameter later, and have the result end up looking like stdlib > datetime, then I think 1(b) is the obvious choice. My guess is that > this is also what's most compatible with pandas, which is currently > keeping its own timezone object outside of the dtype. > Good point, all else being equal, compatability with Pandas would be a good thing. Any downsides? I guess this would mean that we start raising an error > on ISO 8601's with offsets attached, which might annoy some people? > yes, but errors are better than incorrect values... > Writing this made me think of a third option -- tracking, but no real manipulation, of TZ. This would be analogous to the ISO 8601 does -- all it does is note an offset. A given DateTime64 array would have a given offset assigned to it, and the appropriate addition and subtraction would happen at I/O. Offset of 0.00 would be UTC, and there would be a None option for naive. Please no! An integer offset is a terrible way to represent timezones, > well, it would solve the being able to read ISO strings problem, and being able to perform operations with datetimes in multiple time zones. though I guess you could get most of that with UTC-always. > and hardcoding this would just get in the way of a proper solution. > well, that's a point -- if we think there is any hope of a proper solution down the road, then yes, it would be better not to make that harder. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From questions.anon at gmail.com Mon Mar 24 02:39:50 2014 From: questions.anon at gmail.com (questions anon) Date: Mon, 24 Mar 2014 17:39:50 +1100 Subject: [Numpy-discussion] numpy sum each month and then numpy mean of all months Message-ID: Hello all, I have netcdf files that contain hourly rainfall data. Each netcdf file includes one months worth of hours and I have 10 years worth of data. I would like to calculate the sum of each month and then the mean of these summed months across all of the years. I have no problem firstly calculating the sum of each month but then I come up with a mask-size error when I try to calculate the mean of those months. So somehow I am not combining my summed months correctly? Below is the code I am using (focusing on just january at this stage) and below that is the error. Any feedback will be greatly appreciated. from netCDF4 import Dataset import numpy as N import matplotlib.pyplot as plt from mpl_toolkits.basemap import Basemap import os shapefile1="/Users/REGIONS" OutputFolder=r"/Users/rainmonthlysummarystats/" fileforlatlon=Dataset("/Users/WRFsample/IDV71000_VIC_T_SFC.nc", 'r+', 'NETCDF4') LAT=fileforlatlon.variables['latitude'][:] LON=fileforlatlon.variables['longitude'][:] def summaryplots(variable): if variable=='RAIN': ncvariablename='PCP_SFC' MainFolder=r"/Data/WRFmonthly/" ticks=[0, 25, 50, 75, 100, 125, 150, 175, 200] cmap=plt.cm.jet Jan="01" monthseason="Jan" summonthlyrain_all=[] all_variabledata=[] for (path, dirs, files) in os.walk(MainFolder): if os.path.basename(path)==Jan: for ncfile in files: fileext=ncvariablename+'.nc' if ncfile.endswith(fileext): print "dealing with ncfiles:", path+ncfile ncfile=os.path.join(path,ncfile) ncfile=Dataset(ncfile, 'r+', 'NETCDF4') variable=ncfile.variables[ncvariablename][:,:,:] ncfile.close() all_variabledata.append(variable) #combine all data from the chosen variable to make one array for analyses big_array=N.ma.concatenate(all_variabledata) SUM=big_array.sum(axis=0) summonthlyrain_all.append(SUM) del all_variabledata[:] big_array_sumrain=N.ma.concatenate(summonthlyrain_all) MEAN=big_array_sumrain.mean(axis=0) #plot output summary stats map = Basemap(projection='merc',llcrnrlat=-40,urcrnrlat=-33, llcrnrlon=139.0,urcrnrlon=151.0,lat_ts=0,resolution='i') map.drawcoastlines() map.drawstates() map.readshapefile(shapefile1, 'DSE_REGIONS') x,y=map(*N.meshgrid(LON,LAT)) plottitle=ncvariablename+'_mean_'+monthseason plt.title(plottitle) CS = map.contourf(x,y,MEAN, ticks, cmap=cmap) l,b,w,h =0.1,0.1,0.8,0.8 cax = plt.axes([l+w+0.025, b, 0.025, h]) plt.colorbar(CS,cax=cax, drawedges=True) plt.savefig((os.path.join(OutputFolder, plottitle+'.png'))) plt.show() plt.close() summaryplots('RAIN') MaskError Traceback (most recent call last) /Applications/Canopy.app/appdata/canopy-1.3.0.1715.macosx-x86_64/Canopy.app/Contents/lib/python2.7/site-packages/IPython/utils/py3compat.pyc in execfile(fname, *where) 202 else: 203 filename = fname --> 204 __builtin__.execfile(filename, *where) /Users/slburns/Dropbox/Python_code/WRFoutputs/plot_variable_sum_monthlyrain_percentiles_test.py in () 71 72 ---> 73 summaryplots('RAIN') 74 75 /Users/slburns/Dropbox/Python_code/WRFoutputs/plot_variable_sum_monthlyrain_percentiles_test.py in summaryplots(variable) 62 plottitle=ncvariablename+'_mean_'+monthseason 63 plt.title(plottitle) ---> 64 CS = map.contourf(x,y,MEAN, ticks, cmap=cmap) 65 l,b,w,h =0.1,0.1,0.8,0.8 66 cax = plt.axes([l+w+0.025, b, 0.025, h]) /Users/slburns/Applications/User/lib/python2.7/site-packages/mpl_toolkits/basemap/__init__.pyc in with_transform(self, x, y, data, *args, **kwargs) 519 # convert lat/lon coords to map projection coords. 520 x, y = self(x,y) --> 521 return plotfunc(self,x,y,data,*args,**kwargs) 522 return with_transform 523 /Users/slburns/Applications/User/lib/python2.7/site-packages/mpl_toolkits/basemap/__init__.pyc in contourf(self, x, y, data, *args, **kwargs) 3670 # combine with data mask. 3671 mask = np.logical_or(ma.getmaskarray(data),xymask) -> 3672 data = ma.masked_array(data,mask=mask) 3673 CS = ax.contourf(x,y,data,*args,**kwargs) 3674 except: /Users/slburns/Applications/User/lib/python2.7/site-packages/numpy/ma/core.pyc in __new__(cls, data, mask, dtype, copy, subok, ndmin, fill_value, keep_mask, hard_mask, shrink, **options) 2708 msg = "Mask and data not compatible: data size is %i, " + 2709 "mask size is %i." -> 2710 raise MaskError(msg % (nd, nm)) 2711 copy = True 2712 # Set the mask to the new value MaskError: Mask and data not compatible: data size is 193, mask size is 20458. -------------- next part -------------- An HTML attachment was scrubbed... URL: From hoogendoorn.eelco at gmail.com Mon Mar 24 03:32:47 2014 From: hoogendoorn.eelco at gmail.com (Eelco Hoogendoorn) Date: Mon, 24 Mar 2014 08:32:47 +0100 Subject: [Numpy-discussion] Implementing elementary matrices In-Reply-To: <532FA13B.2070909@pagan.io> References: <532FA13B.2070909@pagan.io> Message-ID: Sounds (marginally) useful; although elementary row/column operations are in practice usually better implemented directly by indexing rather than in an operator form. Though I can see a use for the latter. My suggestion: its not a common enough operation to deserve a 4 letter acronym (assuming those are good things in any context). A full 'elementary' would be much preferable I think. On Mon, Mar 24, 2014 at 4:06 AM, Matt Pagan wrote: > Greetings! > I made a patch for NumPy that adds a function for easily creating > elementary matrices. Sorry for not knowing the process for submitting > patches. > > Is this function something the NumPy community could see adding to the > codebase? Are there ways I can improve on this? > > diff --git a/numpy/lib/twodim_base.py b/numpy/lib/twodim_base.py > index 12c0f9b..10073af 100644 > --- a/numpy/lib/twodim_base.py > +++ b/numpy/lib/twodim_base.py > @@ -967,3 +967,85 @@ def triu_indices_from(arr, k=0): > if not (arr.ndim == 2 and arr.shape[0] == arr.shape[1]): > raise ValueError("input array must be 2-d and square") > return triu_indices(arr.shape[0], k) > + > +def elem(N, i, j=None, t=None, dtype=float): > + """ > + Return an elementary matrix. > + > + Parameters > + ---------- > + N : int > + The size of the NxN array to be returned. Elementary matrices > + should be square. > + i : int > + The index of the first row on which operations are to be > + performed. > + j : int > + If set, the index of the second row of which operations are to > + be performed. > + t : scalar > + If set, the factor by which a given row will be multiplied. > + > + Returns > + ------- > + m: ndarray of shape (NxN) > + The identity matrix after a single row operation has been > + performed on it. > + > + See also > + -------- > + eye, identity > + > + Examples > + ------- > + To swap the the first and third rows of a 4x4 identity matirx: > + > + >>> L = elem(4, 0, 2) > + >>> L > + array([[ 0., 0., 1., 0.], > + [ 0., 1., 0., 0.], > + [ 1., 0., 0., 0.], > + [ 0., 0., 0., 1.]]) > + > + This array then becomes quite useful for matrix multiplication. > + > + >>> H = np.matrix([[ 2, 3, 5, 7], > + [11, 13, 17, 19], > + [23, 29, 31, 37], > + [41, 43, 47, 53]]) > + >>> L*H > + matrix([[ 23., 29., 31., 37.], > + [ 11., 13., 17., 19.], > + [ 2., 3., 5., 7.], > + [ 41., 43., 47., 53.]]) > + > + When the elemntary matrix is multiplied by the given matrix, the > + result is the given matrix with it's first and third rows swapped. > + > + If the given matrix is multiplied by the elementary matrix (i.e., > + the multiplication takes place in reverse order, the result is > + the given matrix with its first and third columns swapped. > + > + >>> H*L > + matrix([[ 5., 3., 2., 7.], > + [ 17., 13., 11., 19.], > + [ 31., 29., 23., 37.], > + [ 47., 43., 41., 53.]]) > + > + """ > + m=eye(N, dtype=dtype) > + if j==None and t==None: > + raise ValueError("One or more of %s and %s must be set." % \ > + ('j', 't')) > + return None > + elif t==None: > + swap = np.array(m[i]) > + m[i] = m[j] > + m[j] = swap > + return m > + elif j==None: > + m[i] *= t > + return m > + else: > + m[j] += (t * m[i]) > + return m > > -- > Matt Pagan > matt at pagan.io > PGP: 0xE9284418E360583C > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pierre.haessig at crans.org Mon Mar 24 05:20:33 2014 From: pierre.haessig at crans.org (Pierre Haessig) Date: Mon, 24 Mar 2014 10:20:33 +0100 Subject: [Numpy-discussion] Resolving the associativity/precedence debate for @ In-Reply-To: References: Message-ID: <532FF8E1.6090707@crans.org> Hi, Le 22/03/2014 19:13, Nathaniel Smith a ?crit : > After 88 emails we don't have a conclusion in the other thread (see > [1] for background). But we have to come to some conclusion or another > if we want @ to exist :-). So I'll summarize where the discussion > stands and let's see if we can find some way to resolve this. Thanks for this nice summary. I found the previous thread very interesting to follow. My first reaction when the associativity question was raised was : why would anyone want this non-standard right-associativity ? Indeed, I don't see the special case of Mat*Mat*vec to be important enough to break the common convention. Then, I almost got convinced by the function composition argument. It looks quite elegant, but in the end, there is no current mainstream usage. Also, somebody could use the new @ operator to perform function composition (and not function application) like this : (f at g@h)(x) I don't know where it could be used, but there could be some fun. Then, there is the multiplication chaining, but that belongs clearly to some other discussion, because that's getting close to lazy evaluation with underlying operation optimization. Indeed, why stop at just optimizating just the multiplication ?For this kind of expression-level optim, there is clearly no standard solution : numexpr, theano, ... ? (not familiar with this topic) So +1 for a plain and simple left associativity. (for weak or same, I don't think I understood all the aspects, but I feel that "same" is also the simplest choice) best, Pierre From alan.isaac at gmail.com Mon Mar 24 11:32:12 2014 From: alan.isaac at gmail.com (Alan G Isaac) Date: Mon, 24 Mar 2014 11:32:12 -0400 Subject: [Numpy-discussion] why sort does not accept a key? Message-ID: <53304FFC.2070801@gmail.com> I'm wondering if `sort` intentially does not accept a `key` or if this is just a missing feature? (I suppose that if the `order` argument is specified it would have to accept a sequence of keys ...) Thanks, Alan Isaac From ndarray at mac.com Mon Mar 24 11:47:07 2014 From: ndarray at mac.com (Alexander Belopolsky) Date: Mon, 24 Mar 2014 11:47:07 -0400 Subject: [Numpy-discussion] why sort does not accept a key? In-Reply-To: <53304FFC.2070801@gmail.com> References: <53304FFC.2070801@gmail.com> Message-ID: On Mon, Mar 24, 2014 at 11:32 AM, Alan G Isaac wrote: > I'm wondering if `sort` intentionally does not accept a `key` > or if this is just a missing feature? > It would be very inefficient to call a key function on every element compared during the sort. See np.argsort and np.lexsort for faster alternatives. -------------- next part -------------- An HTML attachment was scrubbed... URL: From alan.isaac at gmail.com Mon Mar 24 12:08:47 2014 From: alan.isaac at gmail.com (Alan G Isaac) Date: Mon, 24 Mar 2014 12:08:47 -0400 Subject: [Numpy-discussion] why sort does not accept a key? In-Reply-To: References: <53304FFC.2070801@gmail.com> Message-ID: <5330588F.7070505@gmail.com> > On Mon, Mar 24, 2014 at 11:32 AM, Alan G Isaac wrote: >> I'm wondering if `sort` intentionally does not accept >> a `key` >> or if this is just a missing feature? On 3/24/2014 11:47 AM, Alexander Belopolsky wrote: > It would be very inefficient to call a key function on > every element compared during the sort. See np.argsort > and np.lexsort for faster alternatives. But the keys could be as in `lexsort`. I am currently using `argsort`, but I can do so because I don't need a lexicographically determined sort order for the indexes. To close with a related question: what is the preferred idiom for a descending sort? Thanks, Alan From josef.pktd at gmail.com Mon Mar 24 12:13:02 2014 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 24 Mar 2014 12:13:02 -0400 Subject: [Numpy-discussion] why sort does not accept a key? In-Reply-To: <5330588F.7070505@gmail.com> References: <53304FFC.2070801@gmail.com> <5330588F.7070505@gmail.com> Message-ID: On Mon, Mar 24, 2014 at 12:08 PM, Alan G Isaac wrote: >> On Mon, Mar 24, 2014 at 11:32 AM, Alan G Isaac wrote: >>> I'm wondering if `sort` intentionally does not accept >>> a `key` >>> or if this is just a missing feature? > > > On 3/24/2014 11:47 AM, Alexander Belopolsky wrote: >> It would be very inefficient to call a key function on >> every element compared during the sort. See np.argsort >> and np.lexsort for faster alternatives. > > > But the keys could be as in `lexsort`. > > I am currently using `argsort`, but I can do so because > I don't need a lexicographically determined sort order > for the indexes. > > > To close with a related question: > what is the preferred idiom for a descending sort? adding [::-1] just creates a new view, pretty low cost. Josef > > Thanks, > Alan > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From matt at pagan.io Mon Mar 24 12:51:15 2014 From: matt at pagan.io (Matt Pagan) Date: Mon, 24 Mar 2014 16:51:15 +0000 Subject: [Numpy-discussion] Implementing elementary matrices In-Reply-To: References: <532FA13B.2070909@pagan.io> Message-ID: <53306283.40909@pagan.io> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA512 Eelco Hoogendoorn: > Sounds (marginally) useful; although elementary row/column > operations are in practice usually better implemented directly by > indexing rather than in an operator form. Though I can see a use > for the latter. > > My suggestion: its not a common enough operation to deserve a 4 > letter acronym (assuming those are good things in any context). A > full 'elementary' would be much preferable I think. > > Great. Is this mailing list the best place to put a revised version of my patch? Should I make a github pull request? - -- Matt Pagan matt at pagan.io PGP: 0xE9284418E360583C -----BEGIN PGP SIGNATURE----- iQIcBAEBCgAGBQJTMGKDAAoJEOkoRBjjYFg8DXwQALE95t9SS8xsFD0PpO3SwZNQ v2SxcnzH123mcrq55zzzHGgh9OUz694fqky2thyiazhKf5sSVka1Gf4b6U06nXE7 OG7+i9qGZgAf6cLBItmPYp2F2y/azAdQNrcVlkQFfzN8Waw4t2sfzKRvtkzT9xfU olY8i2xRHyrOxY+aZ8spxt/uQtY4gHEZUjuSNBVmLfAJI7aFZuJNiqftTp0Ggg5O B8UMKCW3yC3DDLvoU8dClgWCnFVdWyvOpv11ND3bzAS/NC+KHBOTEKwa6aaI4yjf vUiADGisROkIZrt0JsesAdds0AhOb5B6LV5+4oO+g+h+0VcUhiCzLZBSsxiOTRzS nncEfKWkMMeJyj2lfeFrqi6DtfVj4/EgklanX3BMBQo3WC2C3KD4VgiwRN6IpxIP K3PSY90sX8/qoMAEeQRH+oLQg8okUCkiv8RJYD7edUPAeuanA/8sTFqgdvVQn2Uw QUcOyMDCs71hG7c0fvi5nZNgkrRYjR9dRwUepk+i1nUkhUTK/+fHpyYouQZA7ppC X5EEvdTZCydukpW7RlH1R3VVHmOV+XYMmlopHJEdKHcAG++OsxIH6vXDarp5kmvZ aqHmt6atcfwxwiDHDVMcPbpZBx4HxN2X7DzI9lT1PJ4aDdH+uRW0NPe4unDno/K9 6z4se9ggSxRB7XsNkvPU =bqLu -----END PGP SIGNATURE----- From alan.isaac at gmail.com Mon Mar 24 13:05:21 2014 From: alan.isaac at gmail.com (Alan G Isaac) Date: Mon, 24 Mar 2014 13:05:21 -0400 Subject: [Numpy-discussion] why sort does not accept a key? In-Reply-To: References: <53304FFC.2070801@gmail.com> <5330588F.7070505@gmail.com> Message-ID: <533065D1.4010701@gmail.com> > On Mon, Mar 24, 2014 at 12:08 PM, Alan G Isaac >> what is the preferred idiom for a descending sort? On 3/24/2014 12:13 PM, josef.pktd at gmail.com wrote: > adding [::-1] just creates a new view, pretty low cost. I meant when you need to sort on a key (another vector). Currently I'm just reversing the result of argsort with [::-1] but this changes the sort order (relative to a stable descending sort). Alan Isaac From charlesr.harris at gmail.com Mon Mar 24 13:41:12 2014 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 24 Mar 2014 11:41:12 -0600 Subject: [Numpy-discussion] why sort does not accept a key? In-Reply-To: <533065D1.4010701@gmail.com> References: <53304FFC.2070801@gmail.com> <5330588F.7070505@gmail.com> <533065D1.4010701@gmail.com> Message-ID: On Mon, Mar 24, 2014 at 11:05 AM, Alan G Isaac wrote: > > On Mon, Mar 24, 2014 at 12:08 PM, Alan G Isaac > >> what is the preferred idiom for a descending sort? > > > On 3/24/2014 12:13 PM, josef.pktd at gmail.com wrote: > > adding [::-1] just creates a new view, pretty low cost. > > > I meant when you need to sort on a key (another vector). > > Currently I'm just reversing the result of argsort with [::-1] > but this changes the sort order (relative to a stable descending sort). > > For integer types you can use the complement as the key In [9]: ~arange(4, dtype=uint8) Out[9]: array([255, 254, 253, 252], dtype=uint8) In [10]: ~arange(4, dtype=int8) Out[10]: array([-1, -2, -3, -4], dtype=int8) For float types you would need to use the negative. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From alan.isaac at gmail.com Mon Mar 24 13:57:27 2014 From: alan.isaac at gmail.com (Alan G Isaac) Date: Mon, 24 Mar 2014 13:57:27 -0400 Subject: [Numpy-discussion] why sort does not accept a key? In-Reply-To: References: <53304FFC.2070801@gmail.com> <5330588F.7070505@gmail.com> <533065D1.4010701@gmail.com> Message-ID: <53307207.2070907@gmail.com> On 3/24/2014 1:41 PM, Charles R Harris wrote: > For float types you would need to use the negative. Yes, that's all I could come up with. So ... shd `sort` have a `reverse` option, like Python's builtin? Alan From charlesr.harris at gmail.com Mon Mar 24 14:58:09 2014 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 24 Mar 2014 12:58:09 -0600 Subject: [Numpy-discussion] why sort does not accept a key? In-Reply-To: <53307207.2070907@gmail.com> References: <53304FFC.2070801@gmail.com> <5330588F.7070505@gmail.com> <533065D1.4010701@gmail.com> <53307207.2070907@gmail.com> Message-ID: On Mon, Mar 24, 2014 at 11:57 AM, Alan G Isaac wrote: > On 3/24/2014 1:41 PM, Charles R Harris wrote: > > For float types you would need to use the negative. > > > Yes, that's all I could come up with. > So ... shd `sort` have a `reverse` option, > like Python's builtin? > > Well, it would double the number of sorts if we kept them efficient with efficient type specific compares. Alternatively, we could sort the reverse option using less efficient compare function calls. I think whether or not we add a reverse option should depend on how many want it. One potential problem would be that the sort function pointers are built into PyArray_ArrFuncs, and adding more might be problematic. That has been a continuing source of pain, for instance in adding the partition and binsearch functions, so it might be worth just expanding that structure. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From joseph.martinot-lagarde at m4x.org Mon Mar 24 15:40:36 2014 From: joseph.martinot-lagarde at m4x.org (Joseph Martinot-Lagarde) Date: Mon, 24 Mar 2014 20:40:36 +0100 Subject: [Numpy-discussion] Resolving the associativity/precedence debate for @ In-Reply-To: References: Message-ID: <53308A34.3020508@m4x.org> Le 22/03/2014 19:13, Nathaniel Smith a ?crit : > Hi all, > > After 88 emails we don't have a conclusion in the other thread (see > [1] for background). But we have to come to some conclusion or another > if we want @ to exist :-). So I'll summarize where the discussion > stands and let's see if we can find some way to resolve this. > > The fundamental question is whether a chain like (a @ b @ c) should be > evaluated left-to-right (left-associativity) or right-to-left > (right-associativity). > > DATA SOURCE 1: > > This isn't a democratic vote, but it's useful to get a sense of > people's intuitions. Counting messages in the other thread, opinion > seems to be pretty evenly split: > > == "Votes" for right-associativity == > Weak-right: [2] [3] [5] > Tight-right: [4] [6] > Same-right: [11] > > == "Votes" for left-associativity == > Same-left: [7] [8] [14] [15] [16] > Tight-left: [9] > Weak-left: [12] > > There's also the "grouping" option (described in [10]), but that's > received very little support (just [13]). > > DATA SOURCE 2: > > Several people have suggested that performance considerations mean > that right-to-left evaluation is more common in practice than > left-to-right evaluation. But, if we look at actual usage in Python > code, that's not what we find: when people call dot() in chains, then > they're about evenly split, and actually use the left-to-right, > left-associative order slightly more often than the right-to-left, > right-associative order: > http://mail.scipy.org/pipermail/numpy-discussion/2014-March/069578.html > > DATA SOURCE 3: > > And if we look at other languages, then we find: > > == "Votes" for right-associativity == > > > == "Votes" for left-associativity == > Same-left: Matlab, Julia, IDL, GAUSS > Tight-left: R This is a very strong point. Lots of people come to python with a background and would be surprised if python behaves differently than other mainstream frameworks. I'll add that simpler is better, multiplications should behave the same way, and vote for same-left. > > And Mathematica uses the "grouping" approach. > > ARGUMENTS: > > The final outcome of this is that I need to write a piece of text that > says what our (at least rough) consensus is, and lays out the reasons. > So long as the "vote" is so evenly split, I can't really do this. But > I can imagine what the different pieces of text might look like. > > THE CASE FOR LEFT-ASSOCIATIVITY: > > If I were writing this text in favor of left-associativity, I'd point out: > > - "Special cases aren't special enough to break the rules". Every > single operator in Python besides ** is left-associative (and ** has > very compelling arguments for right associativity). @ does not have > similarly compelling arguments. If we were having this debate about > "*", then it'd probably be much more lopsided towards > left-associativity. So sure, there's something about @ that makes > right-associativity *more* appealing than for most other operators. > But not *much* more appealing -- left-associativity still comes out at > least slightly ahead in all of the above measures. And there are a lot > of benefits to avoiding special cases -- it gives fewer rules to > memorize, fewer rules to remember, etc. So @ may be a special case, > but it's not special enough. > > - Other languages with @ operators almost overwhelmingly use the > "same-left" rule, and I've never heard anyone complain about this, so > clearly nothing horrible will happen if we go this way. We have no > comparable experience for right-associativity. > > - Given left-associativity, then there's good agreement about the > appropriate precedence. If we choose right-associativity then it's > much less clear (which will then make things harder for experts to > remember, harder for non-experts to guess, etc.). Note that one of the > votes for right-associativity even preferred the "same-right" rule, > which is not even technically possible... > > This strikes me as a nice solid case. > > THE CASE FOR RIGHT-ASSOCIATIVITY: > > If I were writing this text in favor of right-associativity, I'd point out: > > - Because matrix multiplication has a tight conceptual association > with function application/composition, many mathematically > sophisticated users have an intuition that a matrix expression like > R S x > proceeds from right-to-left, with first S transforming x, and then R > transforming the result. This isn't universally agreed, but at the > least this intuition is more common than for other operations like 2 * > 3 * 4 that everyone reads as going from left-to-right. > > - There might be some speed argument, if people often write things > like "Mat @ Mat @ vec"? But no-one has found any evidence that people > actually do write such things often. > > - There's been discussion of how right-associativity might maybe > perhaps be nice for non-matmul applications? But I can't use those > arguments [17] [18]. > > - ...... I got nothin'. > > I am fine with any outcome here. (I'm actually listed under *both* > tight-right and same-left in the straw poll above ;-).) I'm totally > happy to go back to Guido et al and argue for right-associativity. BUT > if you all want me to do that then you need to give me some better > arguments to use :-). > > One way to do this might be to go through the ((a @ b) @ c) and (a @ > (b @ c)) examples I found (the scripts are available [19], and I can > help modify them to spit out more details), look at the actual code, > and demonstrate that the left-to-right ((a @ b) @ c) cases are mostly > ones where evaluation order doesn't matter (i.e., they could have just > as well been written the other way), and the right-to-left (a @ (b @ > c)) ones are ones where right-to-left really is better than > left-to-right. I have no idea if this is true, and it'll require some > reading of the surrounding code to figure out what the matrix shapes > are, but if it *is* true then at least that'd be something solid that > right-associativity advocates could point to. > > WHAT NOW: > > If seeing this summary laid out caused you to change your mind one way > or the other, then please reply and say so! > > If you think of some way to get more data that could favor one or the > other option (running some kind of usability experiment on people? > [20]), then please share! > > If you think of some other arguments in favor of left-associativity, > then please share! > > If you think of some other arguments in favor of right-associativity, > especially ones that are based on something besides your gut feeling, > then PLEASE PLEASE share! > > Thanks, > -n > > [1] http://mail.scipy.org/pipermail/numpy-discussion/2014-March/069444.html > [2] http://mail.scipy.org/pipermail/numpy-discussion/2014-March/069446.html > [3] http://mail.scipy.org/pipermail/numpy-discussion/2014-March/069450.html > [4] http://mail.scipy.org/pipermail/numpy-discussion/2014-March/069452.html > [5] http://mail.scipy.org/pipermail/numpy-discussion/2014-March/069455.html > [6] https://mail.python.org/pipermail/python-ideas/2014-March/027124.html > [7] http://mail.scipy.org/pipermail/numpy-discussion/2014-March/069512.html > [8] http://mail.scipy.org/pipermail/numpy-discussion/2014-March/069513.html > [9] http://mail.scipy.org/pipermail/numpy-discussion/2014-March/069467.html > [10] http://mail.scipy.org/pipermail/numpy-discussion/2014-March/069530.html > [11] http://mail.scipy.org/pipermail/numpy-discussion/2014-March/069537.html > [12] http://mail.scipy.org/pipermail/numpy-discussion/2014-March/069540.html > [13] http://mail.scipy.org/pipermail/numpy-discussion/2014-March/069571.html > [14] http://mail.scipy.org/pipermail/numpy-discussion/2014-March/069514.html > [15] http://mail.scipy.org/pipermail/numpy-discussion/2014-March/069531.html > [16] http://mail.scipy.org/pipermail/numpy-discussion/2014-March/069567.html > [17] http://mail.scipy.org/pipermail/numpy-discussion/2014-March/069527.html > [18] http://mail.scipy.org/pipermail/numpy-discussion/2014-March/069584.html > [19] https://gist.github.com/njsmith/9157645 > [20] http://mail.scipy.org/pipermail/numpy-discussion/2014-March/069584.html > --- Ce courrier ?lectronique ne contient aucun virus ou logiciel malveillant parce que la protection avast! Antivirus est active. http://www.avast.com From charlesr.harris at gmail.com Mon Mar 24 19:28:22 2014 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 24 Mar 2014 17:28:22 -0600 Subject: [Numpy-discussion] Drop support for Python 3.2? Message-ID: Hi All, The suggestion has been made the we drop Python 3.2 support in numpy 1.9 and scipy 0.15. The advantage, from my point of view, to supporting Python >= 3.3 is that the u'unicode' syntax is supported in 3.3 and this makes it easier to maintain compatibility with Python 2.6 and 2.7. However, it may be a bit early to make this move, so feedback is welcome. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Mon Mar 24 19:34:56 2014 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Tue, 25 Mar 2014 00:34:56 +0100 Subject: [Numpy-discussion] GSoC - what's next Message-ID: Hi all, Just a short update, now that the deadline for submitting GSoC proposals has passed. We received four proposals: 1. Leo Mao, "Numpy: Vector Math Library Integration" 2. Janani Padmanbhan, "SciPy/NumPy- enhancements in scipy.special (hyp2f1, sph_harm) " 3. Ankit Agrawal, "SciPy : Discrete Wavelet Transforms and related algorithms" 4. Richard Tsai, "SciPy: Rewrite and improve cluster package in Cython" In principle we have enough mentors for all these proposals, although it looks like I'll have to chase them a bit to sign up in Melange etc. We're going to have to rank these proposals in the next week or so and communicate our preferences to the PSF organizers. I'll be in touch with all potential mentors about that. The announcement by Google of which students were accepted will follow on April 21. Thanks to the four students who spent a lot of effort on creating solid proposals, and to all who helped them by giving feedback. Looks like it's going to be a productive summer! Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From jtaylor.debian at googlemail.com Mon Mar 24 19:37:06 2014 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Tue, 25 Mar 2014 00:37:06 +0100 Subject: [Numpy-discussion] Drop support for Python 3.2? In-Reply-To: References: Message-ID: <5330C1A2.2080605@googlemail.com> On 25.03.2014 00:28, Charles R Harris wrote: > Hi All, > > The suggestion has been made the we drop Python 3.2 support in numpy 1.9 > and scipy 0.15. The advantage, from my point of view, to supporting > Python >= 3.3 is that the u'unicode' syntax is supported in 3.3 and this > makes it easier to maintain compatibility with Python 2.6 and 2.7. > However, it may be a bit early to make this move, so feedback is welcome. > > Chuck > > I don't think we need to drop source compatibility in numpy, to my knowledge the missing u'' syntax is not a big issue in the numpy code. In case it does come up we can use a u() function like in six. python3.2 is still the default in Ubuntu 12.04 which is still supported for 3 years. While probably few people actually use it, it would at least simplify binary package backports if numpy continues to build with python3.2 But +1 on dropping the 3.2 binary builds, looking at the download numbers of 1.8.1rc1 there seems to be a relatively low demand, but lets check them again after the final release. Cheers, Julian From ralf.gommers at gmail.com Mon Mar 24 19:40:29 2014 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Tue, 25 Mar 2014 00:40:29 +0100 Subject: [Numpy-discussion] Drop support for Python 3.2? In-Reply-To: <5330C1A2.2080605@googlemail.com> References: <5330C1A2.2080605@googlemail.com> Message-ID: On Tue, Mar 25, 2014 at 12:37 AM, Julian Taylor < jtaylor.debian at googlemail.com> wrote: > On 25.03.2014 00:28, Charles R Harris wrote: > > Hi All, > > > > The suggestion has been made the we drop Python 3.2 support in numpy 1.9 > > and scipy 0.15. The advantage, from my point of view, to supporting > > Python >= 3.3 is that the u'unicode' syntax is supported in 3.3 and this > > makes it easier to maintain compatibility with Python 2.6 and 2.7. > > However, it may be a bit early to make this move, so feedback is welcome. > > > > Chuck > > > > > > I don't think we need to drop source compatibility in numpy, to my > knowledge the missing u'' syntax is not a big issue in the numpy code. > In case it does come up we can use a u() function like in six. > python3.2 is still the default in Ubuntu 12.04 which is still supported > for 3 years. While probably few people actually use it, it would at > least simplify binary package backports if numpy continues to build with > python3.2 > > But +1 on dropping the 3.2 binary builds, looking at the download > numbers of 1.8.1rc1 there seems to be a relatively low demand, but lets > check them again after the final release. > Looking at SF download numbers, the first binaries to drop would actually be those for 2.6. It doesn't hurt much to keep them though. I would be in favor of dropping support for 3.2 if it turns out that there are specific issues for that Python version, like the QZ segfaults that Skipper ran into. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Mon Mar 24 19:56:33 2014 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 24 Mar 2014 23:56:33 +0000 Subject: [Numpy-discussion] Resolving the associativity/precedence debate for @ In-Reply-To: References: Message-ID: On Sat, Mar 22, 2014 at 6:13 PM, Nathaniel Smith wrote: > After 88 emails we don't have a conclusion in the other thread (see > [1] for background). But we have to come to some conclusion or another > if we want @ to exist :-). So I'll summarize where the discussion > stands and let's see if we can find some way to resolve this. Response in this thread so far seems (AFAICT) to have pretty much converged on same-left. If you think that this would be terrible and there is some compelling argument against it, then please speak up! Otherwise, if no-one objects, then I'll go ahead in the next few days and put same-left into the PEP. -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org From charlesr.harris at gmail.com Mon Mar 24 19:58:57 2014 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 24 Mar 2014 17:58:57 -0600 Subject: [Numpy-discussion] Resolving the associativity/precedence debate for @ In-Reply-To: References: Message-ID: On Mon, Mar 24, 2014 at 5:56 PM, Nathaniel Smith wrote: > On Sat, Mar 22, 2014 at 6:13 PM, Nathaniel Smith wrote: > > After 88 emails we don't have a conclusion in the other thread (see > > [1] for background). But we have to come to some conclusion or another > > if we want @ to exist :-). So I'll summarize where the discussion > > stands and let's see if we can find some way to resolve this. > > Response in this thread so far seems (AFAICT) to have pretty much > converged on same-left. > > If you think that this would be terrible and there is some compelling > argument against it, then please speak up! Otherwise, if no-one > objects, then I'll go ahead in the next few days and put same-left > into the PEP. > I think we should take a close look at broadcasting before deciding on the precedence. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Mon Mar 24 20:11:06 2014 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 24 Mar 2014 18:11:06 -0600 Subject: [Numpy-discussion] GSoC - what's next In-Reply-To: References: Message-ID: On Mon, Mar 24, 2014 at 5:34 PM, Ralf Gommers wrote: > Hi all, > > Just a short update, now that the deadline for submitting GSoC proposals > has passed. We received four proposals: > > 1. Leo Mao, "Numpy: Vector Math Library Integration" > 2. Janani Padmanbhan, "SciPy/NumPy- enhancements in scipy.special (hyp2f1, > sph_harm) " > 3. Ankit Agrawal, "SciPy : Discrete Wavelet Transforms and related > algorithms" > 4. Richard Tsai, "SciPy: Rewrite and improve cluster package in Cython" > > In principle we have enough mentors for all these proposals, although it > looks like I'll have to chase them a bit to sign up in Melange etc. We're > going to have to rank these proposals in the next week or so and > communicate our preferences to the PSF organizers. I'll be in touch with > all potential mentors about that. The announcement by Google of which > students were accepted will follow on April 21. > > Thanks to the four students who spent a lot of effort on creating solid > proposals, and to all who helped them by giving feedback. Looks like it's > going to be a productive summer! > > I signed up to mentor Richard Tsai. I also signed up for Leo Mao, but I hope Julian Taylor will do the work ;) It is much more his area of expertise. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Mon Mar 24 20:33:23 2014 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 25 Mar 2014 00:33:23 +0000 Subject: [Numpy-discussion] Resolving the associativity/precedence debate for @ In-Reply-To: References: Message-ID: On Mon, Mar 24, 2014 at 11:58 PM, Charles R Harris wrote: > On Mon, Mar 24, 2014 at 5:56 PM, Nathaniel Smith wrote: >> >> On Sat, Mar 22, 2014 at 6:13 PM, Nathaniel Smith wrote: >> > After 88 emails we don't have a conclusion in the other thread (see >> > [1] for background). But we have to come to some conclusion or another >> > if we want @ to exist :-). So I'll summarize where the discussion >> > stands and let's see if we can find some way to resolve this. >> >> Response in this thread so far seems (AFAICT) to have pretty much >> converged on same-left. >> >> If you think that this would be terrible and there is some compelling >> argument against it, then please speak up! Otherwise, if no-one >> objects, then I'll go ahead in the next few days and put same-left >> into the PEP. > > > I think we should take a close look at broadcasting before deciding on the > precedence. Can you elaborate? Like what, concretely, do you think we need to do now? -n -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org From rmcgibbo at gmail.com Tue Mar 25 08:28:04 2014 From: rmcgibbo at gmail.com (Robert McGibbon) Date: Tue, 25 Mar 2014 05:28:04 -0700 Subject: [Numpy-discussion] Any numpy core devs on gittip (or similar)? Message-ID: Hey, I've just been reading the discussionon python-dev about PEP466, the meta-theme of which is that building and maintaining mission-critical software is just really tough. I don't really use SSL from python (the major topic of that discussion), but I depend on numpy and the awesome scientific stack every day. Anyways, that conversation made me think about what we take for granted in terms of software tools. If any of the numpy core devs are on gittip or similar, I think many of us would be happy to pitch in a few bucks for beer money or the like. Keep up the awesome, -Robert -------------- next part -------------- An HTML attachment was scrubbed... URL: From cjwilliams43 at gmail.com Tue Mar 25 17:13:04 2014 From: cjwilliams43 at gmail.com (Colin J. Williams) Date: Tue, 25 Mar 2014 17:13:04 -0400 Subject: [Numpy-discussion] NumPy-Discussion Digest, Vol 90, Issue 83 In-Reply-To: References: Message-ID: <5331F160.4060905@gmail.com> On 25-Mar-2014 1:00 PM, numpy-discussion-request at scipy.org wrote: > Message: 3 Date: Mon, 24 Mar 2014 17:58:57 -0600 From: Charles R > Harris Subject: Re: [Numpy-discussion] > Resolving the associativity/precedence debate for @ To: Discussion of > Numerical Python Message-ID: > > Content-Type: text/plain; charset="iso-8859-1" On Mon, Mar 24, 2014 at > 5:56 PM, Nathaniel Smith wrote: >> >On Sat, Mar 22, 2014 at 6:13 PM, Nathaniel Smith wrote: >>> > >After 88 emails we don't have a conclusion in the other thread (see >>> > >[1] for background). But we have to come to some conclusion or another >>> > >if we want @ to exist:-). So I'll summarize where the discussion >>> > >stands and let's see if we can find some way to resolve this. >> > >> >Response in this thread so far seems (AFAICT) to have pretty much >> >converged on same-left. >> > >> >If you think that this would be terrible and there is some compelling >> >argument against it, then please speak up! Otherwise, if no-one >> >objects, then I'll go ahead in the next few days and put same-left >> >into the PEP. >> > > I think we should take a close look at broadcasting before deciding on the > precedence. > > Chuck > -------------- next part -------------- > An HTML attachment was scrubbed... > URL:http://mail.scipy.org/pipermail/numpy-discussion/attachments/20140324/626e79be/attachment-0001.html > > ------------------------------ Perhaps a closer look at np.matrix is needed too. There has been no close exploration of the weaknesses perceived by Nathan in the Matrix class. Are any of these of substance? If so, what corrections would be needed? Would implementation of those changes be done readily. I would like to see a Vector class, as a specialization of Matrix. These would avoid the use of an additional operator which would only be used with numpy. Colin W. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jtaylor.debian at googlemail.com Tue Mar 25 19:38:04 2014 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Wed, 26 Mar 2014 00:38:04 +0100 Subject: [Numpy-discussion] ANN: NumPy 1.8.1 release Message-ID: <5332135C.7040903@googlemail.com> Hello, I'm happy to announce the of Numpy 1.8.1. This is a bugfix only release supporting Python 2.6 - 2.7 and 3.2 - 3.4. More than 48 issues have been fixed, the most important issues are listed in the release notes: https://github.com/numpy/numpy/blob/maintenance/1.8.x/doc/release/1.8.1-notes.rst Compared to the last release candidate we have fixed a regression of the 1.8 series that prevented using some gufunc based linalg functions on larger matrices on 32 bit systems. This implied a few changes in the NDIter C-API which might expose insufficient checks for error conditions in third party applications. Please check the release notes for details. Source tarballs, windows installers and release notes can be found at https://sourceforge.net/projects/numpy/files/NumPy/1.8.1 Cheers, Julian Taylor -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From matthew.brett at gmail.com Tue Mar 25 23:47:53 2014 From: matthew.brett at gmail.com (Matthew Brett) Date: Tue, 25 Mar 2014 20:47:53 -0700 Subject: [Numpy-discussion] ANN: NumPy 1.8.1 release In-Reply-To: <5332135C.7040903@googlemail.com> References: <5332135C.7040903@googlemail.com> Message-ID: Hi, On Tue, Mar 25, 2014 at 4:38 PM, Julian Taylor wrote: > Hello, > > I'm happy to announce the of Numpy 1.8.1. > This is a bugfix only release supporting Python 2.6 - 2.7 and 3.2 - 3.4. > > More than 48 issues have been fixed, the most important issues are > listed in the release notes: > https://github.com/numpy/numpy/blob/maintenance/1.8.x/doc/release/1.8.1-notes.rst > > Compared to the last release candidate we have fixed a regression of the > 1.8 series that prevented using some gufunc based linalg functions on > larger matrices on 32 bit systems. This implied a few changes in the > NDIter C-API which might expose insufficient checks for error conditions > in third party applications. Please check the release notes for details. > > Source tarballs, windows installers and release notes can be found at > https://sourceforge.net/projects/numpy/files/NumPy/1.8.1 Thanks a lot for this. I've just posted OSX wheels for Pythons 2.7, 3.3, 3.4. It's a strange feeling doing this: $ pip install numpy Downloading/unpacking numpy Downloading numpy-1.8.1-cp27-none-macosx_10_6_intel.whl (3.6MB): 3.6MB downloaded Installing collected packages: numpy Successfully installed numpy Cleaning up... 5 seconds waiting on a home internet connection and a numpy install.... Nice. Cheers, Matthew From josef.pktd at gmail.com Wed Mar 26 07:21:45 2014 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 26 Mar 2014 07:21:45 -0400 Subject: [Numpy-discussion] Resolving the associativity/precedence debate for @ In-Reply-To: References: Message-ID: On Mon, Mar 24, 2014 at 8:33 PM, Nathaniel Smith wrote: > On Mon, Mar 24, 2014 at 11:58 PM, Charles R Harris > wrote: >> On Mon, Mar 24, 2014 at 5:56 PM, Nathaniel Smith wrote: >>> >>> On Sat, Mar 22, 2014 at 6:13 PM, Nathaniel Smith wrote: >>> > After 88 emails we don't have a conclusion in the other thread (see >>> > [1] for background). But we have to come to some conclusion or another >>> > if we want @ to exist :-). So I'll summarize where the discussion >>> > stands and let's see if we can find some way to resolve this. >>> >>> Response in this thread so far seems (AFAICT) to have pretty much >>> converged on same-left. >>> >>> If you think that this would be terrible and there is some compelling >>> argument against it, then please speak up! Otherwise, if no-one >>> objects, then I'll go ahead in the next few days and put same-left >>> into the PEP. >> >> >> I think we should take a close look at broadcasting before deciding on the >> precedence. > > Can you elaborate? Like what, concretely, do you think we need to do now? ??? "In examples like this, parenthesizing the code aggressively to spell out the logic, not only to Stata but also to yourself and anybody else reading it, should cause no embarrassment. You need not assume knowledge of Stata's precedence rules that determine interpretation when several operators are used in one expression. More importantly, you may avoid some horrible little bugs." Nicholas J. Cox Trying to figure out what Stata is using: elementwise operations are just below their matrix version in operator precedence. But Stata came late to matrix algebra, and is definitely not like Matlab or Gauss, or numpy. Josef > > -n > > -- > Nathaniel J. Smith > Postdoctoral researcher - Informatics - University of Edinburgh > http://vorpus.org > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From alan.isaac at gmail.com Wed Mar 26 10:48:15 2014 From: alan.isaac at gmail.com (Alan G Isaac) Date: Wed, 26 Mar 2014 10:48:15 -0400 Subject: [Numpy-discussion] NumPy-Discussion Digest, Vol 90, Issue 83 In-Reply-To: <5331F160.4060905@gmail.com> References: <5331F160.4060905@gmail.com> Message-ID: <5332E8AF.3080803@gmail.com> On 3/25/2014 5:13 PM, Colin J. Williams wrote: > avoid the use of an additional operator which would only be used with numpy. http://legacy.python.org/dev/peps/pep-0465/#but-isn-t-matrix-multiplication-a-pretty-niche-requirement Alan Isaac From olivier.grisel at ensta.org Wed Mar 26 11:27:56 2014 From: olivier.grisel at ensta.org (Olivier Grisel) Date: Wed, 26 Mar 2014 16:27:56 +0100 Subject: [Numpy-discussion] Default builds of OpenBLAS development branch are now fork safe In-Reply-To: References: <1225660970414595360.835902sturla.molden-gmail.com@news.gmane.org> Message-ID: Hi Carl, I installed Python 2.7.6 64 bits on a windows server instance from rackspace cloud and then ran get-pip.py and then could successfully install the numpy and scipy wheel packages from your google drive folder. I tested dot products and scipy.linalg.svd and they work as expected. Then I uncompressed your mingw toolchain in c:\mingw, put c:\mingw\bin in my PATH and tried to build the scikit-learn git master with it, however it fails with: building 'sklearn.__check_build._check_build' extension compiling C sources C compiler: gcc -DMS_WIN64 -O2 -msse -msse2 -Wall -Wstrict-prototypes compile options: '-D__MSVCRT_VERSION__=0x0900 -Ic:\Python27\lib\site-packages\numpy\core\include -Ic:\Python27\lib\site-packages\numpy\core\include -Ic:\Python2 7\include -Ic:\Python27\PC -c' gcc -DMS_WIN64 -O2 -msse -msse2 -Wall -Wstrict-prototypes -D__MSVCRT_VERSION__=0x0900 -Ic:\Python27\lib\site-packages\numpy\core\include -Ic:\Python27\lib\site- packages\numpy\core\include -Ic:\Python27\include -Ic:\Python27\PC -c sklearn\__check_build\_check_build.c -o build\temp.win-amd64-2.7\Release\sklearn\__check_b uild\_check_build.o Found executable c:\mingw\bin\gcc.exe gcc -shared -Wl,-gc-sections -Wl,-s build\temp.win-amd64-2.7\Release\sklearn\__check_build\_check_build.o -Lc:\Python27\libs -Lc:\Python27\PCbuild\amd64 -Lbuild \temp.win-amd64-2.7 -lpython27 -lmsvcr90 -o build\lib.win-amd64-2.7\sklearn\__check_build\_check_build.pyd build\temp.win-amd64-2.7\Release\sklearn\__check_build\_check_build.o:_check_build.c:(.text+0x3): undefined reference to `__imp__Py_NoneStruct' build\temp.win-amd64-2.7\Release\sklearn\__check_build\_check_build.o:_check_build.c:(.text+0x1ca): undefined reference to `__imp__PyThreadState_Current' build\temp.win-amd64-2.7\Release\sklearn\__check_build\_check_build.o:_check_build.c:(.text+0x405): undefined reference to `__imp_PyExc_ImportError' c:/mingw/bin/../lib/gcc/x86_64-w64-mingw32/4.8.2/../../../../x86_64-w64-mingw32/bin/ld.exe: build\temp.win-amd64-2.7\Release\sklearn\__check_build\_check_build. o: bad reloc address 0x0 in section `.data' collect2.exe: error: ld returned 1 exit status error: Command "gcc -shared -Wl,-gc-sections -Wl,-s build\temp.win-amd64-2.7\Release\sklearn\__check_build\_check_build.o -Lc:\Python27\libs -Lc:\Python27\PCbui ld\amd64 -Lbuild\temp.win-amd64-2.7 -lpython27 -lmsvcr90 -o build\lib.win-amd64-2.7\sklearn\__check_build\_check_build.pyd" failed with exit status 1 Furthermore, when I try to introspect the blas information on this box I get: In [1]: import scipy C:\Python27\lib\site-packages\numpy\core\__init__.py:6: Warning: Numpy 64bit experimental build with Mingw-w64 and OpenBlas. Use with care. from . import multiarray OpenBLAS : Your OS does not support AVX instructions. OpenBLAS is using Barcelona kernels as a fallback, which may give poorer performance. In [2]: scipy.show_config() umfpack_info: NOT AVAILABLE lapack_opt_info: libraries = ['openblas', 'openblas'] library_dirs = ['D:/devel/mingw64static/x86_64-w64-mingw32/lib'] language = f77 blas_opt_info: libraries = ['openblas', 'openblas'] library_dirs = ['D:/devel/mingw64static/x86_64-w64-mingw32/lib'] language = f77 openblas_info: libraries = ['openblas', 'openblas'] library_dirs = ['D:/devel/mingw64static/x86_64-w64-mingw32/lib'] language = f77 blas_mkl_info: NOT AVAILABLE In [3]: from numpy.distutils.system_info import get_info In [4]: get_info('blas_opt') C:\Python27\lib\site-packages\numpy\distutils\system_info.py:576: UserWarning: Specified path D:/devel/mingw64static/x86_64-w64-mingw32/lib is invalid. warnings.warn('Specified path %s is invalid.' % d) C:\Python27\lib\site-packages\numpy\distutils\system_info.py:1522: UserWarning: Atlas (http://math-atlas.sourceforge.net/) libraries not found. Directories to search for the libraries can be specified in the numpy/distutils/site.cfg file (section [atlas]) or by setting the ATLAS environment variable. warnings.warn(AtlasNotFoundError.__doc__) C:\Python27\lib\site-packages\numpy\distutils\system_info.py:1531: UserWarning: Blas (http://www.netlib.org/blas/) libraries not found. Directories to search for the libraries can be specified in the numpy/distutils/site.cfg file (section [blas]) or by setting the BLAS environment variable. warnings.warn(BlasNotFoundError.__doc__) C:\Python27\lib\site-packages\numpy\distutils\system_info.py:1534: UserWarning: Blas (http://www.netlib.org/blas/) sources not found. Directories to search for the sources can be specified in the numpy/distutils/site.cfg file (section [blas_src]) or by setting the BLAS_SRC environment variable. warnings.warn(BlasSrcNotFoundError.__doc__) Out[4]: {} Would it make sense to embed the blas and lapack header files as part of this numpy wheel and make numpy.distutils.system_info return the lib and include folder pointing to the embedded libopenblas.dll and header files so has to make third party libraries directly buildable against those? -- Olivier From charlesr.harris at gmail.com Wed Mar 26 11:56:54 2014 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 26 Mar 2014 09:56:54 -0600 Subject: [Numpy-discussion] ANN: NumPy 1.8.1 release In-Reply-To: References: <5332135C.7040903@googlemail.com> Message-ID: On Tue, Mar 25, 2014 at 9:47 PM, Matthew Brett wrote: > Hi, > > On Tue, Mar 25, 2014 at 4:38 PM, Julian Taylor > wrote: > > Hello, > > > > I'm happy to announce the of Numpy 1.8.1. > > This is a bugfix only release supporting Python 2.6 - 2.7 and 3.2 - 3.4. > > > > More than 48 issues have been fixed, the most important issues are > > listed in the release notes: > > > https://github.com/numpy/numpy/blob/maintenance/1.8.x/doc/release/1.8.1-notes.rst > > > > Compared to the last release candidate we have fixed a regression of the > > 1.8 series that prevented using some gufunc based linalg functions on > > larger matrices on 32 bit systems. This implied a few changes in the > > NDIter C-API which might expose insufficient checks for error conditions > > in third party applications. Please check the release notes for details. > > > > Source tarballs, windows installers and release notes can be found at > > https://sourceforge.net/projects/numpy/files/NumPy/1.8.1 > > Thanks a lot for this. I've just posted OSX wheels for Pythons 2.7, 3.3, > 3.4. > > It's a strange feeling doing this: > > $ pip install numpy > Downloading/unpacking numpy > Downloading numpy-1.8.1-cp27-none-macosx_10_6_intel.whl (3.6MB): > 3.6MB downloaded > Installing collected packages: numpy > Successfully installed numpy > Cleaning up... > > 5 seconds waiting on a home internet connection and a numpy install.... > Nice. > > That's pretty neat. Now if we can get the windows versions to be as easy. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From jtaylor.debian at googlemail.com Wed Mar 26 14:34:39 2014 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Wed, 26 Mar 2014 19:34:39 +0100 Subject: [Numpy-discussion] Default builds of OpenBLAS development branch are now fork safe In-Reply-To: References: <1225660970414595360.835902sturla.molden-gmail.com@news.gmane.org> Message-ID: <53331DBF.6020504@googlemail.com> On 26.03.2014 16:27, Olivier Grisel wrote: > Hi Carl, > > I installed Python 2.7.6 64 bits on a windows server instance from > rackspace cloud and then ran get-pip.py and then could successfully > install the numpy and scipy wheel packages from your google drive > folder. I tested dot products and scipy.linalg.svd and they work as > expected. > > > Would it make sense to embed the blas and lapack header files as part > of this numpy wheel and make numpy.distutils.system_info return the > lib and include folder pointing to the embedded libopenblas.dll and > header files so has to make third party libraries directly buildable > against those? > as for using openblas by default in binary builds, no. pthread openblas build is now fork safe which is great but it is still not reliable enough for a default. E.g. the current latest release 0.2.8 still has one crash bug on dgemv[1], and wrong results zherk/zer2[2] and dgemv/cgemv[3]. git head has the former four fixed bug still has wrong results for cgemv. The not so old 0.2.8 also fixed whole bunch more crashes and wrong result issues (crashes on QR, uninitialized data use in dgemm, ...). None of the fixes received unit tests, so I am somewhat pessimistic that it will improve, especially as the maintainer is dissertating (is that the right word?) and most of the code is assembler code only few people can write (it is simply not required anymore, we have code generators and intrinsics for that). Openblas is great if you do not have the patience to build ATLAS and only use a restricted set of functionality and platforms you can easily test. Currently it is in my opinion not suitable for a general purpose library like numpy. I don't have any objections to adding get_info("openblas") if that does not work yet. Patches welcome. [0] https://github.com/xianyi/OpenBLAS/issues/304 [1] https://github.com/xianyi/OpenBLAS/issues/333 [2] https://github.com/xianyi/OpenBLAS/issues/340 From Slaunger at gmail.com Wed Mar 26 15:48:52 2014 From: Slaunger at gmail.com (Slaunger) Date: Wed, 26 Mar 2014 12:48:52 -0700 (PDT) Subject: [Numpy-discussion] Is there a pure numpy recipe for this? Message-ID: <1395863332202-37077.post@n7.nabble.com> I am working on solving a recent recreational mathematical problem on Project Euler . I have a solution, which works fine for small N up to 10^5 but it takes too long to compute for the actual problem, where N is of the order 2*10^7. The problem is nested loops, and I am hoping to avoid one level in the loop by using clever numpy magic (as I have done often before). However, there is one step, which I cannot figure out how to do using numpy operations alone, and I am hoping for some help The subproblem is that I have in principle k = 1, ..., N sets of boolean arrays f_k and g_k each of length N. For each k I need to find the number of elements i where both f_k[i] and g_k[i] are True and sum that up over all N values of k. A problem of the order 4*10^14 if I just do it brute force. This takes way too long (there is a one minute rule). However, after a lot of thinking and by using some properties of the f_k and g_k I have managed to construct using only pure numpy function and only a single loop over k, arrays f_k_changes_at g_k_changes_at which contain the indices i at which the functions change it boolean value from True to False or False to True. It so happens that the number of changes is only a small fraction of N, the fraction decreases with larger N, so the size of these changes_at arrays contains perhaps only 1000 elements instead of 10000000 for each k, a significant reduction of complexity. Now, my problem is to figure out for how many values of i both f_k and g_k are True given the changes_at arrays. As this may be a little hard to understand here is a specific example of how these arrays can look like for k = 2 and N = 150 f_2_changes_at = [ 2 3 39 41 58 59 65 66 93 102 145] g_2_changes_at = [ 2 94 101 146 149] with the boundary condition that f_2[0] = g_2[0] = False Which expands to i f_2 g_2 f_2 and g_2 0 F F F 1 F F F <- 2 T T T <- 3 F T F 4 F T F ... 38 F T F <- 39 T T T 40 T T T <- 41 F T F 42 F T F ... 57 F T F <- 58 T T T <- 59 F T F 60 F T F ... 64 F T F <- 65 T T T <- 66 F T F ... 92 F T F <- 93 T T T <- 94 T F F ... 100 T F F <- 101 T T T <- 102 F T F ... 144 F T F <- 145 T T T <- 146 T F F 147 T F F 148 T F F <- 149 T T T <- With the sum of elements fulfilling the condition being (see arrows) (2 - 1) + (40 - 38) + (58 - 57) + (65 - 64) + (93 - 92) + (101 - 100) + (145 - 144) + (149 - 148) = 1 + 2 + 1 + 1 + 1 + 1 + 1 + 1 = 9 So, is there a numpy recipe for doing the equivalent process without expanding it into the full arrays? I have tried looping over each element in the changes_at arrays and build up the sums, but that is too inefficient as I then have an inner for loop containing conditional branching code Thanks in advance, Slaunger -- View this message in context: http://numpy-discussion.10968.n7.nabble.com/Is-there-a-pure-numpy-recipe-for-this-tp37077.html Sent from the Numpy-discussion mailing list archive at Nabble.com. From deshpande.jaidev at gmail.com Wed Mar 26 16:09:18 2014 From: deshpande.jaidev at gmail.com (Jaidev Deshpande) Date: Thu, 27 Mar 2014 01:39:18 +0530 Subject: [Numpy-discussion] Is there a pure numpy recipe for this? In-Reply-To: <1395863332202-37077.post@n7.nabble.com> References: <1395863332202-37077.post@n7.nabble.com> Message-ID: On Thu, Mar 27, 2014 at 1:18 AM, Slaunger wrote: > I am working on solving a recent recreational mathematical problem on > Project Euler . I have a solution, which works > fine for small N up to 10^5 but it takes too long to compute for the actual > problem, where N is of the order 2*10^7. The problem is nested loops, and I > am hoping to avoid one level in the loop by using clever numpy magic (as I > have done often before). However, there is one step, which I cannot figure > out how to do using numpy operations alone, and I am hoping for some help > > The subproblem is that I have in principle k = 1, ..., N sets of boolean > arrays > f_k and g_k each of length N. > > For each k I need to find the number of elements i where both f_k[i] and > g_k[i] are True and sum that up over all N values of k. > > A problem of the order 4*10^14 if I just do it brute force. This takes way > too long (there is a one minute rule). > > However, after a lot of thinking and by using some properties of the f_k > and > g_k I have managed to construct using only pure numpy function and only a > single loop over k, arrays > > f_k_changes_at > g_k_changes_at > > which contain the indices i at which the functions change it boolean value > from True to False or False to True. > > It so happens that the number of changes is only a small fraction of N, the > fraction decreases with larger N, so the size of these changes_at arrays > contains perhaps only 1000 elements instead of 10000000 for each k, a > significant reduction of complexity. > > Now, my problem is to figure out for how many values of i both f_k and g_k > are True given the changes_at arrays. > > As this may be a little hard to understand here is a specific example of > how > these arrays can look like for k = 2 and N = 150 > > f_2_changes_at = [ 2 3 39 41 58 59 65 66 93 102 145] > g_2_changes_at = [ 2 94 101 146 149] > > with the boundary condition that f_2[0] = g_2[0] = False > > Which expands to > i f_2 g_2 f_2 and g_2 > 0 F F F > 1 F F F <- > 2 T T T <- > 3 F T F > 4 F T F > ... > 38 F T F <- > 39 T T T > 40 T T T <- > 41 F T F > 42 F T F > ... > 57 F T F <- > 58 T T T <- > 59 F T F > 60 F T F > ... > 64 F T F <- > 65 T T T <- > 66 F T F > ... > 92 F T F <- > 93 T T T <- > 94 T F F > ... > 100 T F F <- > 101 T T T <- > 102 F T F > ... > 144 F T F <- > 145 T T T <- > 146 T F F > 147 T F F > 148 T F F <- > 149 T T T <- > > With the sum of elements fulfilling the condition being (see arrows) > > (2 - 1) + (40 - 38) + (58 - 57) + (65 - 64) + (93 - 92) + (101 - 100) + > (145 > - 144) + (149 - 148) = > 1 + 2 + 1 + 1 + 1 + 1 + 1 + 1 = 9 > > So, is there a numpy recipe for doing the equivalent process without > expanding it into the full arrays? > > I have tried looping over each element in the changes_at arrays and build > up > the sums, but that is too inefficient as I then have an inner for loop > containing conditional branching code > > Thanks in advance, Slaunger > > > > -- > View this message in context: > http://numpy-discussion.10968.n7.nabble.com/Is-there-a-pure-numpy-recipe-for-this-tp37077.html > Sent from the Numpy-discussion mailing list archive at Nabble.com. > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > Can you provide a link to the problem itself? -- JD -------------- next part -------------- An HTML attachment was scrubbed... URL: From jsseabold at gmail.com Wed Mar 26 16:12:20 2014 From: jsseabold at gmail.com (Skipper Seabold) Date: Wed, 26 Mar 2014 16:12:20 -0400 Subject: [Numpy-discussion] Is there a pure numpy recipe for this? In-Reply-To: <1395863332202-37077.post@n7.nabble.com> References: <1395863332202-37077.post@n7.nabble.com> Message-ID: On Wed, Mar 26, 2014 at 3:48 PM, Slaunger wrote: > I am working on solving a recent recreational mathematical problem on > Project Euler . I have a solution, which works > fine for small N up to 10^5 but it takes too long to compute for the actual > problem, where N is of the order 2*10^7. The problem is nested loops, and I > am hoping to avoid one level in the loop by using clever numpy magic (as I > have done often before). However, there is one step, which I cannot figure > out how to do using numpy operations alone, and I am hoping for some help > > The subproblem is that I have in principle k = 1, ..., N sets of boolean > arrays > f_k and g_k each of length N. > > For each k I need to find the number of elements i where both f_k[i] and > g_k[i] are True and sum that up over all N values of k. IIUC, [~/] [1]: np.logical_and([True, False, True], [False, False, True]) [1]: array([False, False, True], dtype=bool) You can avoid looping over k since they're all the same length [~/] [3]: np.logical_and([[True, False],[False, True],[False, True]], [[False, False], [False, True], [True, True]]) [3]: array([[False, False], [False, True], [False, True]], dtype=bool) [~/] [4]: np.sum(np.logical_and([[True, False],[False, True],[False, True]], [[False, False], [False, True], [True, True]]), axis=0) [4]: array([0, 2]) > > A problem of the order 4*10^14 if I just do it brute force. This takes way > too long (there is a one minute rule). > > However, after a lot of thinking and by using some properties of the f_k and > g_k I have managed to construct using only pure numpy function and only a > single loop over k, arrays > > f_k_changes_at > g_k_changes_at > > which contain the indices i at which the functions change it boolean value > from True to False or False to True. > > It so happens that the number of changes is only a small fraction of N, the > fraction decreases with larger N, so the size of these changes_at arrays > contains perhaps only 1000 elements instead of 10000000 for each k, a > significant reduction of complexity. > > Now, my problem is to figure out for how many values of i both f_k and g_k > are True given the changes_at arrays. > > As this may be a little hard to understand here is a specific example of how > these arrays can look like for k = 2 and N = 150 > > f_2_changes_at = [ 2 3 39 41 58 59 65 66 93 102 145] > g_2_changes_at = [ 2 94 101 146 149] > > with the boundary condition that f_2[0] = g_2[0] = False > > Which expands to > i f_2 g_2 f_2 and g_2 > 0 F F F > 1 F F F <- > 2 T T T <- > 3 F T F > 4 F T F > ... > 38 F T F <- > 39 T T T > 40 T T T <- > 41 F T F > 42 F T F > ... > 57 F T F <- > 58 T T T <- > 59 F T F > 60 F T F > ... > 64 F T F <- > 65 T T T <- > 66 F T F > ... > 92 F T F <- > 93 T T T <- > 94 T F F > ... > 100 T F F <- > 101 T T T <- > 102 F T F > ... > 144 F T F <- > 145 T T T <- > 146 T F F > 147 T F F > 148 T F F <- > 149 T T T <- > > With the sum of elements fulfilling the condition being (see arrows) > > (2 - 1) + (40 - 38) + (58 - 57) + (65 - 64) + (93 - 92) + (101 - 100) + (145 > - 144) + (149 - 148) = > 1 + 2 + 1 + 1 + 1 + 1 + 1 + 1 = 9 > > So, is there a numpy recipe for doing the equivalent process without > expanding it into the full arrays? > > I have tried looping over each element in the changes_at arrays and build up > the sums, but that is too inefficient as I then have an inner for loop > containing conditional branching code > > Thanks in advance, Slaunger > > > > -- > View this message in context: http://numpy-discussion.10968.n7.nabble.com/Is-there-a-pure-numpy-recipe-for-this-tp37077.html > Sent from the Numpy-discussion mailing list archive at Nabble.com. > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From Slaunger at gmail.com Wed Mar 26 16:20:18 2014 From: Slaunger at gmail.com (Slaunger) Date: Wed, 26 Mar 2014 13:20:18 -0700 (PDT) Subject: [Numpy-discussion] Is there a pure numpy recipe for this? In-Reply-To: References: <1395863332202-37077.post@n7.nabble.com> Message-ID: <1395865218647-37080.post@n7.nabble.com> Jaidev Deshpande wrote > Can you provide a link to the problem itself? > > -- > JD I'd rather not state the problem number since it should not be so easy to search for it and find this thread, but I can state that at the the time being, it is the problem with the highest problem number (released this Saturday) -- View this message in context: http://numpy-discussion.10968.n7.nabble.com/Is-there-a-pure-numpy-recipe-for-this-tp37077p37080.html Sent from the Numpy-discussion mailing list archive at Nabble.com. From Slaunger at gmail.com Wed Mar 26 16:28:15 2014 From: Slaunger at gmail.com (Slaunger) Date: Wed, 26 Mar 2014 13:28:15 -0700 (PDT) Subject: [Numpy-discussion] Is there a pure numpy recipe for this? In-Reply-To: References: <1395863332202-37077.post@n7.nabble.com> Message-ID: <1395865695422-37081.post@n7.nabble.com> jseabold wrote > IIUC, > > [~/] > [1]: np.logical_and([True, False, True], [False, False, True]) > [1]: array([False, False, True], dtype=bool) > > You can avoid looping over k since they're all the same length > > [~/] > [3]: np.logical_and([[True, False],[False, True],[False, True]], > [[False, False], [False, True], [True, True]]) > [3]: > array([[False, False], > [False, True], > [False, True]], dtype=bool) > > [~/] > [4]: np.sum(np.logical_and([[True, False],[False, True],[False, > True]], [[False, False], [False, True], [True, True]]), axis=0) > [4]: array([0, 2]) Well, yes, if you work with the pure f_k and g_k that is true, but this two-dimensional array will have 4*10^14 elements and will exhaust my memory. That is why I have found a more efficient method for finding only the much fewer changes_at elements for each k, and these arrays have unequal length, and has to be considered for eack k (which is tolerable as long as I avoid a further inner loop for each k in explicit Python). I could implement this in C and get it done sufficiently efficient. I just like to make a point in demonstrating this is also doable in finite time in Python/numpy. -- View this message in context: http://numpy-discussion.10968.n7.nabble.com/Is-there-a-pure-numpy-recipe-for-this-tp37077p37081.html Sent from the Numpy-discussion mailing list archive at Nabble.com. From njs at pobox.com Wed Mar 26 16:41:28 2014 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 26 Mar 2014 21:41:28 +0100 Subject: [Numpy-discussion] Default builds of OpenBLAS development branch are now fork safe In-Reply-To: <53331DBF.6020504@googlemail.com> References: <1225660970414595360.835902sturla.molden-gmail.com@news.gmane.org> <53331DBF.6020504@googlemail.com> Message-ID: On Wed, Mar 26, 2014 at 7:34 PM, Julian Taylor wrote: > as for using openblas by default in binary builds, no. > pthread openblas build is now fork safe which is great but it is still > not reliable enough for a default. > E.g. the current latest release 0.2.8 still has one crash bug on > dgemv[1], and wrong results zherk/zer2[2] and dgemv/cgemv[3]. > git head has the former four fixed bug still has wrong results for cgemv. > The not so old 0.2.8 also fixed whole bunch more crashes and wrong > result issues (crashes on QR, uninitialized data use in dgemm, ...). > None of the fixes received unit tests, so I am somewhat pessimistic that > it will improve, especially as the maintainer is dissertating (is that > the right word?) and most of the code is assembler code only few people > can write (it is simply not required anymore, we have code generators > and intrinsics for that). > > Openblas is great if you do not have the patience to build ATLAS and > only use a restricted set of functionality and platforms you can easily > test. > Currently it is in my opinion not suitable for a general purpose library > like numpy. Those problems you list are pretty damning, but neither is it reasonable to expect everyone to manually build ATLAS on every machine they use (or their students use, or...) :-(. So what other options do we have for general purpose builds? Give up and use MKL? How's eigen-blas doing these days? (I guess from skimming their docs they use OpenMP?) -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org From jaime.frio at gmail.com Wed Mar 26 16:58:52 2014 From: jaime.frio at gmail.com (=?ISO-8859-1?Q?Jaime_Fern=E1ndez_del_R=EDo?=) Date: Wed, 26 Mar 2014 13:58:52 -0700 Subject: [Numpy-discussion] Is there a pure numpy recipe for this? In-Reply-To: <1395865695422-37081.post@n7.nabble.com> References: <1395863332202-37077.post@n7.nabble.com> <1395865695422-37081.post@n7.nabble.com> Message-ID: On Wed, Mar 26, 2014 at 1:28 PM, Slaunger wrote: See if you can make sense of the following. It is a little cryptic, but it works: f_change = np.array([2, 3, 39, 41, 58, 59, 65, 66, 93, 102, 145]) g_change = np.array([2, 94, 101, 146, 149]) N = 150 if len(f_change) % 2 : f_change = np.append(f_change, N) if len(g_change) % 2 : g_change = np.append(g_change, N) idx = np.searchsorted(f_change, g_change) f_change_exp = np.insert(np.insert(f_change, idx, g_change), idx + np.arange(len(idx)), g_change) idx2 = np.searchsorted(g_change, f_change_exp) f_change_lens = f_change_exp[1::2] - f_change_exp[::2] true_true_intervals = idx2[1::2] % 2 != 0 total = np.sum(f_change_lens[true_true_intervals]) >>> total 9 I'll gladly elaborate on what's going on, just ask! Jaime -- (\__/) ( O.o) ( > <) Este es Conejo. Copia a Conejo en tu firma y ay?dale en sus planes de dominaci?n mundial. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jtaylor.debian at googlemail.com Wed Mar 26 17:08:11 2014 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Wed, 26 Mar 2014 22:08:11 +0100 Subject: [Numpy-discussion] Default builds of OpenBLAS development branch are now fork safe In-Reply-To: References: <1225660970414595360.835902sturla.molden-gmail.com@news.gmane.org> <53331DBF.6020504@googlemail.com> Message-ID: <533341BB.1070003@googlemail.com> On 26.03.2014 21:41, Nathaniel Smith wrote: > On Wed, Mar 26, 2014 at 7:34 PM, Julian Taylor > wrote: >> as for using openblas by default in binary builds, no. >> pthread openblas build is now fork safe which is great but it is still >> not reliable enough for a default. >> E.g. the current latest release 0.2.8 still has one crash bug on >> dgemv[1], and wrong results zherk/zer2[2] and dgemv/cgemv[3]. >> git head has the former four fixed bug still has wrong results for cgemv. >> The not so old 0.2.8 also fixed whole bunch more crashes and wrong >> result issues (crashes on QR, uninitialized data use in dgemm, ...). >> None of the fixes received unit tests, so I am somewhat pessimistic that >> it will improve, especially as the maintainer is dissertating (is that >> the right word?) and most of the code is assembler code only few people >> can write (it is simply not required anymore, we have code generators >> and intrinsics for that). >> >> Openblas is great if you do not have the patience to build ATLAS and >> only use a restricted set of functionality and platforms you can easily >> test. >> Currently it is in my opinion not suitable for a general purpose library >> like numpy. > > Those problems you list are pretty damning, but neither is it > reasonable to expect everyone to manually build ATLAS on every machine > they use (or their students use, or...) :-(. So what other options do > we have for general purpose builds? Give up and use MKL? How's > eigen-blas doing these days? (I guess from skimming their docs they > use OpenMP?) > I don't think general purpose builds need to have perfect performance. we should provide something that works and allow users to tune it when required. The slower general purpose build is also a great testcase to verify that the tuned build works for your problem. I didn't notice this is a reply to third party provided win64 binaries with openblas. I though it was about official numpy binaries with openblas again. Third party binaries using openblas are great, especially these that seem to warn that this is experimental. It helps ironing out the kinks of openblas. Thanks for providing them. From jsseabold at gmail.com Wed Mar 26 17:10:20 2014 From: jsseabold at gmail.com (Skipper Seabold) Date: Wed, 26 Mar 2014 17:10:20 -0400 Subject: [Numpy-discussion] Is there a pure numpy recipe for this? In-Reply-To: <1395865695422-37081.post@n7.nabble.com> References: <1395863332202-37077.post@n7.nabble.com> <1395865695422-37081.post@n7.nabble.com> Message-ID: On Wed, Mar 26, 2014 at 4:28 PM, Slaunger wrote: > jseabold wrote >> IIUC, >> >> [~/] >> [1]: np.logical_and([True, False, True], [False, False, True]) >> [1]: array([False, False, True], dtype=bool) >> >> You can avoid looping over k since they're all the same length >> >> [~/] >> [3]: np.logical_and([[True, False],[False, True],[False, True]], >> [[False, False], [False, True], [True, True]]) >> [3]: >> array([[False, False], >> [False, True], >> [False, True]], dtype=bool) >> >> [~/] >> [4]: np.sum(np.logical_and([[True, False],[False, True],[False, >> True]], [[False, False], [False, True], [True, True]]), axis=0) >> [4]: array([0, 2]) > > Well, yes, if you work with the pure f_k and g_k that is true, but this > two-dimensional array will have 4*10^14 elements and will exhaust my memory. > > That is why I have found a more efficient method for finding only the much > fewer changes_at elements for each k, and these arrays have unequal length, > and has to be considered for eack k (which is tolerable as long as I avoid a > further inner loop for each k in explicit Python). > > I could implement this in C and get it done sufficiently efficient. I just > like to make a point in demonstrating this is also doable in finite time in > Python/numpy. > If you want to attack it straight on and keep it conceptually simple, this looks like it would work. Fair warning, I've never done this and have no idea if it's actually memory and computationally efficient, so I'd be interested to hear from experts. I just wanted to see if it would work from disk. I wonder if a solution using PyTables would be faster. Provided that you can chunk your data into a memmap array, then something you *could* do N = 2*10**7 chunk_size = 100000 farr1 = 'scratch/arr1' farr2 = 'scratch/arr2' arr1 = np.memmap(farr1, dtype='uint8', mode='w+', shape=(N, 4)) arr2 = np.memmap(farr2, dtype='uint8', mode='w+', shape=(N, 4)) for i in xrange(0, N, chunk_size): arr1[i:i+chunk_size] = np.random.randint(2, size=(chunk_size, 4)).astype(np.uint8) arr2[i:i+chunk_size] = np.random.randint(2, size=(chunk_size, 4)).astype(np.uint8) del arr1 del arr2 arr1 = np.memmap(farr1, mode='r', dtype='uint8', shape=(N,4)) arr2 = np.memmap(farr2, mode='r', dtype='uint8', shape=(N,4)) equal = np.logical_and(arr1[:chunk_size], arr2[:chunk_size]).sum(0) for i in xrange(chunk_size, N, chunk_size): equal += np.logical_and(arr1[i:i+chunk_size], arr2[i:i+chunk_size]).sum(0) Skipper From olivier.grisel at ensta.org Wed Mar 26 17:17:46 2014 From: olivier.grisel at ensta.org (Olivier Grisel) Date: Wed, 26 Mar 2014 22:17:46 +0100 Subject: [Numpy-discussion] Default builds of OpenBLAS development branch are now fork safe In-Reply-To: <533341BB.1070003@googlemail.com> References: <1225660970414595360.835902sturla.molden-gmail.com@news.gmane.org> <53331DBF.6020504@googlemail.com> <533341BB.1070003@googlemail.com> Message-ID: My understanding of Carl's effort is that the long term goal is to have official windows whl packages for both numpy and scipy published on PyPI with a builtin BLAS / LAPACK implementation so that users can do `pip install scipy` under windows and get something that just works without have to install any compiler (fortran or C) nor any additional library manually. Most windows users are beginners and you cannot really expect them to understand how to build the whole scipy stack from source. The current solution (executable setup installers) is not optimal as it requires Administrator rights to run, does not resolve dependencies as pip does and cannot be installed in virtualenvs. If we can build numpy / scipy whl packages for windows with the Atlas dlls then fine embedded in the numpy package then good. It does not need to be the fastest BLAS / LAPACK lib in my opinion. Just something that works. The problem with ATLAS is that you need to select the number of thread at build time AFAIK. But we could set it to a reasonable default (e.g. 4 threads) for the default windows package. -- Olivier From Slaunger at gmail.com Wed Mar 26 17:23:51 2014 From: Slaunger at gmail.com (Slaunger) Date: Wed, 26 Mar 2014 14:23:51 -0700 (PDT) Subject: [Numpy-discussion] Is there a pure numpy recipe for this? In-Reply-To: References: <1395863332202-37077.post@n7.nabble.com> <1395865695422-37081.post@n7.nabble.com> Message-ID: <1395869031131-37087.post@n7.nabble.com> Jaime Fern?ndez del R?o wrote > On Wed, Mar 26, 2014 at 1:28 PM, Slaunger < > Slaunger@ > > wrote: > > See if you can make sense of the following. It is a little cryptic, but it > works: > > f_change = np.array([2, 3, 39, 41, 58, 59, 65, 66, 93, 102, 145]) > > g_change = np.array([2, 94, 101, 146, 149]) > > N = 150 > > > if len(f_change) % 2 : > > f_change = np.append(f_change, N) > > > if len(g_change) % 2 : > > g_change = np.append(g_change, N) > > > idx = np.searchsorted(f_change, g_change) > > > f_change_exp = np.insert(np.insert(f_change, idx, g_change), > > idx + np.arange(len(idx)), g_change) > > > idx2 = np.searchsorted(g_change, f_change_exp) > > > f_change_lens = f_change_exp[1::2] - f_change_exp[::2] > > true_true_intervals = idx2[1::2] % 2 != 0 > > > total = np.sum(f_change_lens[true_true_intervals]) > > >>>> total > > 9 > > I'll gladly elaborate on what's going on, just ask! > > Jaime Hol? Jaime! YOU ARE A GENIUS!. I understand exactly what you are doing! You know what. I just had a shower, nothing like having a shower for thinking about hard problems, and I got the same idea: Make the changes_at arrays even length if needed by appending N, then use searchsorted to figure out what values to merge and where! Only I did not know about the append and insert methods. Very, very nice! (I only knew concatenate, which would be clumsy for just appending one element), and for the insert I would have concatenated and in-place sorted, but your solution is much more elegant! You saved my evening! Actually, my head has been spinning about this problem the last three evenings without having been able to nail it down. Abrazos de Dinamarca, Slaunger -- View this message in context: http://numpy-discussion.10968.n7.nabble.com/Is-there-a-pure-numpy-recipe-for-this-tp37077p37087.html Sent from the Numpy-discussion mailing list archive at Nabble.com. From jtaylor.debian at googlemail.com Wed Mar 26 17:31:08 2014 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Wed, 26 Mar 2014 22:31:08 +0100 Subject: [Numpy-discussion] Default builds of OpenBLAS development branch are now fork safe In-Reply-To: References: <1225660970414595360.835902sturla.molden-gmail.com@news.gmane.org> <53331DBF.6020504@googlemail.com> <533341BB.1070003@googlemail.com> Message-ID: <5333471C.802@googlemail.com> On 26.03.2014 22:17, Olivier Grisel wrote: > > The problem with ATLAS is that you need to select the number of thread > at build time AFAIK. But we could set it to a reasonable default (e.g. > 4 threads) for the default windows package. > You have to set the number of threads at build time with OpenBlas too. At runtime it then selects the number of online cpus but limited to the build time maximum. It defaults to the maximum of the machine it was built on. You need to explicitly override that for generic binaries. (I think debian binaries uses 64 which is probably reasonable for non MIC systems) From Slaunger at gmail.com Wed Mar 26 17:33:06 2014 From: Slaunger at gmail.com (Slaunger) Date: Wed, 26 Mar 2014 14:33:06 -0700 (PDT) Subject: [Numpy-discussion] Is there a pure numpy recipe for this? In-Reply-To: References: <1395863332202-37077.post@n7.nabble.com> <1395865695422-37081.post@n7.nabble.com> Message-ID: <1395869586103-37089.post@n7.nabble.com> jseabold wrote >> >> Well, yes, if you work with the pure f_k and g_k that is true, but this >> two-dimensional array will have 4*10^14 elements and will exhaust my >> memory. >> >> That is why I have found a more efficient method for finding only the >> much >> fewer changes_at elements for each k, and these arrays have unequal >> length, >> and has to be considered for eack k (which is tolerable as long as I >> avoid a >> further inner loop for each k in explicit Python). >> >> I could implement this in C and get it done sufficiently efficient. I >> just >> like to make a point in demonstrating this is also doable in finite time >> in >> Python/numpy. >> > > If you want to attack it straight on and keep it conceptually simple, > this looks like it would work. Fair warning, I've never done this and > have no idea if it's actually memory and computationally efficient, so > I'd be interested to hear from experts. I just wanted to see if it > would work from disk. I wonder if a solution using PyTables would be > faster. > > Provided that you can chunk your data into a memmap array, then > something you *could* do > > N = 2*10**7 > chunk_size = 100000 > > farr1 = 'scratch/arr1' > farr2 = 'scratch/arr2' > > arr1 = np.memmap(farr1, dtype='uint8', mode='w+', shape=(N, 4)) > arr2 = np.memmap(farr2, dtype='uint8', mode='w+', shape=(N, 4)) > > for i in xrange(0, N, chunk_size): > arr1[i:i+chunk_size] = np.random.randint(2, size=(chunk_size, > 4)).astype(np.uint8) > arr2[i:i+chunk_size] = np.random.randint(2, size=(chunk_size, > 4)).astype(np.uint8) > > del arr1 > del arr2 > > arr1 = np.memmap(farr1, mode='r', dtype='uint8', shape=(N,4)) > arr2 = np.memmap(farr2, mode='r', dtype='uint8', shape=(N,4)) > > > equal = np.logical_and(arr1[:chunk_size], > arr2[:chunk_size]).sum(0) > > for i in xrange(chunk_size, N, chunk_size): > equal += np.logical_and(arr1[i:i+chunk_size], > arr2[i:i+chunk_size]).sum(0) > > Skipper Thanks for the proposal Skipper, I have used memmap before, and this may work, but still the number of elementary and operations needed (although hidden under the hood of chunked logical_and) will be about a factor of 1000 larger than what is actually needed due to the sparsity in the "roots" of the logical functions I actually have, and that will result in hours or days of computation instead of minute(s). I think I will first give it a go using the procedure described by Jaime (tomorrow, no more time today), as I have gone through a lot of pain constructing the changes_at arrays using a fast and efficient method. --Slaunger -- View this message in context: http://numpy-discussion.10968.n7.nabble.com/Is-there-a-pure-numpy-recipe-for-this-tp37077p37089.html Sent from the Numpy-discussion mailing list archive at Nabble.com. From olivier.grisel at ensta.org Wed Mar 26 17:35:47 2014 From: olivier.grisel at ensta.org (Olivier Grisel) Date: Wed, 26 Mar 2014 22:35:47 +0100 Subject: [Numpy-discussion] Default builds of OpenBLAS development branch are now fork safe In-Reply-To: <5333471C.802@googlemail.com> References: <1225660970414595360.835902sturla.molden-gmail.com@news.gmane.org> <53331DBF.6020504@googlemail.com> <533341BB.1070003@googlemail.com> <5333471C.802@googlemail.com> Message-ID: 2014-03-26 22:31 GMT+01:00 Julian Taylor : > On 26.03.2014 22:17, Olivier Grisel wrote: >> >> The problem with ATLAS is that you need to select the number of thread >> at build time AFAIK. But we could set it to a reasonable default (e.g. >> 4 threads) for the default windows package. >> > > You have to set the number of threads at build time with OpenBlas too. > At runtime it then selects the number of online cpus but limited to the > build time maximum. > It defaults to the maximum of the machine it was built on. You need to > explicitly override that for generic binaries. (I think debian binaries > uses 64 which is probably reasonable for non MIC systems) Yes, the official windows binary for OpenBLAS is also build with a maximum number of threads of 64 (NUM_THREADS=64) which I find reasonable since the runtime number of threads will be capped by the actual number of cores. For ATLAS I don't think there is a runtime cap, or is there? -- Olivier http://twitter.com/ogrisel - http://github.com/ogrisel From jaime.frio at gmail.com Wed Mar 26 17:50:18 2014 From: jaime.frio at gmail.com (=?ISO-8859-1?Q?Jaime_Fern=E1ndez_del_R=EDo?=) Date: Wed, 26 Mar 2014 14:50:18 -0700 Subject: [Numpy-discussion] Is there a pure numpy recipe for this? In-Reply-To: <1395869031131-37087.post@n7.nabble.com> References: <1395863332202-37077.post@n7.nabble.com> <1395865695422-37081.post@n7.nabble.com> <1395869031131-37087.post@n7.nabble.com> Message-ID: On Wed, Mar 26, 2014 at 2:23 PM, Slaunger wrote: > Jaime Fern?ndez del R?o wrote > > You saved my evening! Actually, my head has been spinning about this > problem > the last three evenings without having been able to nail it down. > I had to quit Project Euler about 5 years ago because it was taking a huge toll on my mental health. I did learn/remember a ton of math, but was staying up all night banging my head against the problems much too often. Every now and then I do peek back and sometimes attempt a problem or two, but try to stay away for my own good. If you want to be projecteuler friends, I'm jfrio over there... Jaime -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Wed Mar 26 18:02:44 2014 From: chris.barker at noaa.gov (Chris Barker) Date: Wed, 26 Mar 2014 15:02:44 -0700 Subject: [Numpy-discussion] ANN: NumPy 1.8.1 release In-Reply-To: References: <5332135C.7040903@googlemail.com> Message-ID: On Wed, Mar 26, 2014 at 8:56 AM, Charles R Harris wrote: > > 5 seconds waiting on a home internet connection and a numpy install.... >> Nice. >> >> > That's pretty neat. Now if we can get the windows versions to be as easy. > > Indeed -- where are we on that? Wasn't there more or less a consensus to put up Windows Wheels with SSE2? Or did we decide that was going to break a few too many systems... I also recall that some folks were working with a new BLAS (OpenBLAS ? ) that might support multi-architecture binaries...that would be a great solution. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From hoogendoorn.eelco at gmail.com Wed Mar 26 18:09:37 2014 From: hoogendoorn.eelco at gmail.com (Eelco Hoogendoorn) Date: Wed, 26 Mar 2014 23:09:37 +0100 Subject: [Numpy-discussion] Is there a pure numpy recipe for this? In-Reply-To: References: <1395863332202-37077.post@n7.nabble.com> <1395865695422-37081.post@n7.nabble.com> <1395869031131-37087.post@n7.nabble.com> Message-ID: Without looking ahead, here is what I came up with; but I see more elegant solutions have been found already. import numpy as np def as_dense(f, length): i = np.zeros(length+1, np.int) i[f[0]] = 1 i[f[1]] = -1 return np.cumsum(i)[:-1] def as_sparse(d): diff = np.diff(np.concatenate(([0], d))) on, = np.nonzero(diff) on = on if on.size%2==0 else np.append(on, len(d)) return on.reshape(-1,2).T def join(f, g): on = np.sort(np.concatenate((f[0], g[0]))) off = np.sort(np.concatenate((f[1], g[1]))) I = np.argsort( np.concatenate((on, off)) ).argsort().reshape(2,-1) Q = -np.ones((2,I.size), np.int) Q[0,I[0]] = on Q[1,I[1]] = off idx_on = np.logical_and( Q[0,1:]*Q[0,:-1] < 0, Q[0,:-1]!=-1) idx_off = np.logical_and( Q[1,1:]*Q[1,:-1] < 0, Q[1,1:]!=-1) idx_on = np.concatenate( (idx_on, [False])) idx_off = np.concatenate( ([False], idx_off)) return np.array(( Q[0,idx_on], Q[1,idx_off])) length = 150 f_2_changes_at = np.array( [ 2 , 3 , 39, 41 , 58 , 59, 65 , 66 , 93 ,102, 145, length]) g_2_changes_at = np.array( [ 2 , 94 ,101, 146, 149, length]) f = f_2_changes_at.reshape(-1,2).T g = g_2_changes_at.reshape(-1,2).T dense_result = as_sparse( np.logical_and( as_dense(f, length), as_dense(g,length))) sparse_result = join(f,g) print np.allclose(dense_result, sparse_result) On Wed, Mar 26, 2014 at 10:50 PM, Jaime Fern?ndez del R?o < jaime.frio at gmail.com> wrote: > On Wed, Mar 26, 2014 at 2:23 PM, Slaunger wrote: > >> Jaime Fern?ndez del R?o wrote >> >> You saved my evening! Actually, my head has been spinning about this >> problem >> the last three evenings without having been able to nail it down. >> > > I had to quit Project Euler about 5 years ago because it was taking a huge > toll on my mental health. I did learn/remember a ton of math, but was > staying up all night banging my head against the problems much too often. > Every now and then I do peek back and sometimes attempt a problem or two, > but try to stay away for my own good. > > If you want to be projecteuler friends, I'm jfrio over there... > > Jaime > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Wed Mar 26 18:13:51 2014 From: chris.barker at noaa.gov (Chris Barker) Date: Wed, 26 Mar 2014 15:13:51 -0700 Subject: [Numpy-discussion] Is there a pure numpy recipe for this? In-Reply-To: <1395869031131-37087.post@n7.nabble.com> References: <1395863332202-37077.post@n7.nabble.com> <1395865695422-37081.post@n7.nabble.com> <1395869031131-37087.post@n7.nabble.com> Message-ID: On Wed, Mar 26, 2014 at 2:23 PM, Slaunger wrote: > Only I did not know about the append and insert methods. Very, very nice! > (I > only knew concatenate, which would be clumsy for just appending one > element), > Sorry -- I dont have the time to actually figure out what you are doing, but::: note that numpy arrays are not re-sizable, so np.append() and np.insert() have to make a new array, and copy all the old data over. If you are appending one at a time, this can be pretty darn slow. I wrote a "grow_array" class once, it was a wrapper around a numpy array that pre-allocated extra data to make appending more efficient. It's kind of half-baked code now, but let me know if you are interested. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From tjhnson at gmail.com Wed Mar 26 19:22:24 2014 From: tjhnson at gmail.com (T J) Date: Wed, 26 Mar 2014 18:22:24 -0500 Subject: [Numpy-discussion] Missing Data Message-ID: What is the status of: https://github.com/numpy/numpy/blob/master/doc/neps/missing-data.rst and of missing data in Numpy, more generally? Is np.ma.array still the "state-of-the-art" way to handle missing data? Or has something better and more comprehensive been put together? -------------- next part -------------- An HTML attachment was scrubbed... URL: From argriffi at ncsu.edu Wed Mar 26 19:43:30 2014 From: argriffi at ncsu.edu (alex) Date: Wed, 26 Mar 2014 19:43:30 -0400 Subject: [Numpy-discussion] Missing Data In-Reply-To: References: Message-ID: On Wed, Mar 26, 2014 at 7:22 PM, T J wrote: > What is the status of: > > https://github.com/numpy/numpy/blob/master/doc/neps/missing-data.rst For what it's worth this NEP was written in 2011 by mwiebe who made 258 numpy commits in 2011, 1 in 2012, and 3 in 2014. According to github, in the last few hours alone mwiebe has made several commits to 'blaze' and 'dynd-python'. Here's the blog post explaining the vision for Continuum's 'blaze' project http://continuum.io/blog/blaze. Continuum seems to have been started in early 2012. From matthew.brett at gmail.com Wed Mar 26 19:48:25 2014 From: matthew.brett at gmail.com (Matthew Brett) Date: Wed, 26 Mar 2014 16:48:25 -0700 Subject: [Numpy-discussion] Windows wheels using MKL? Message-ID: Hi, Can I check what is stopping us building official numpy binary wheels for Windows using the Intel Math Kernel Library? * We'd need developer licenses, but those sound like they would be easy to come by * We'd have to add something to the license for the wheel on the lines of the Canopy license [1], derived from the MKL license [2] - is that a problem? Are there other problems for numpy? * I believe we would also need the Intel Fortran compiler when building 64-bit scipy with MSVC. Is that correct? If we have a license, is that a problem? If we did static linking to MKL for numpy and scipy, is there anything stopping us building wheels that would work for XP and above, for 32 and 64 bit? Maybe this is not the ideal solution, but perhaps it's the right thing to do for now? Cheers, Matthew [1] https://www.enthought.com/products/canopy/canopy-license/ [2] http://software.intel.com/en-us/articles/intel-software-development-products-license-agreement From matthew.brett at gmail.com Wed Mar 26 20:29:22 2014 From: matthew.brett at gmail.com (Matthew Brett) Date: Wed, 26 Mar 2014 17:29:22 -0700 Subject: [Numpy-discussion] Windows wheels using MKL? In-Reply-To: References: Message-ID: Hi, On Wed, Mar 26, 2014 at 4:48 PM, Matthew Brett wrote: > Hi, > > Can I check what is stopping us building official numpy binary wheels > for Windows using the Intel Math Kernel Library? > > * We'd need developer licenses, but those sound like they would be > easy to come by > * We'd have to add something to the license for the wheel on the lines > of the Canopy license [1], derived from the MKL license [2] - is that > a problem? > > Are there other problems for numpy? Talking with Fernando, we identified these as being the key problem clauses in the MKL license [1]: D. DISTRIBUTION: Distribution of the Redistributables is also subject to the following limitations: [snipped clauses] (iv) shall use a license agreement that prohibits disassembly and reverse engineering of the Redistributables, (v) shall indemnify, hold harmless, and defend Intel and its suppliers from and against any claims or lawsuits, including attorney's fees, that arise or result from your distribution of any product. The first is a problem that might conceivably be adequately solved by adding a paragraph to the Pypi page for numpy ("If you download and install the windows binaries, you also agree... ") and copying a new clause into the license in the installed tree. Maybe. The second looks like it would be very hard to deal with for open source project like us.... Cheers (sadly), Matthew [1] http://software.intel.com/en-us/articles/intel-software-development-products-license-agreement From charlesr.harris at gmail.com Wed Mar 26 21:22:22 2014 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 26 Mar 2014 19:22:22 -0600 Subject: [Numpy-discussion] Missing Data In-Reply-To: References: Message-ID: On Wed, Mar 26, 2014 at 5:43 PM, alex wrote: > On Wed, Mar 26, 2014 at 7:22 PM, T J wrote: > > What is the status of: > > > > https://github.com/numpy/numpy/blob/master/doc/neps/missing-data.rst > > For what it's worth this NEP was written in 2011 by mwiebe who made > 258 numpy commits in 2011, 1 in 2012, and 3 in 2014. According to > github, in the last few hours alone mwiebe has made several commits to > 'blaze' and 'dynd-python'. Here's the blog post explaining the vision > for Continuum's 'blaze' project http://continuum.io/blog/blaze. > Continuum seems to have been started in early 2012. > It looks like blaze will have bit pattern missing values ala R. I don't know if there is going to be a masked array implementation. The NA code was taken out of Numpy because it was not possible to reach agreement that it did the right thing. Numpy.ma remains the only solution for bad data at this time. The code could probably use more love than it has gotten ;) Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Wed Mar 26 22:28:47 2014 From: matthew.brett at gmail.com (Matthew Brett) Date: Wed, 26 Mar 2014 19:28:47 -0700 Subject: [Numpy-discussion] ANN: NumPy 1.8.1 release In-Reply-To: References: <5332135C.7040903@googlemail.com> Message-ID: Hi, On Wed, Mar 26, 2014 at 3:02 PM, Chris Barker wrote: > On Wed, Mar 26, 2014 at 8:56 AM, Charles R Harris > wrote: >> >> >>> 5 seconds waiting on a home internet connection and a numpy install.... >>> Nice. >>> >> >> That's pretty neat. Now if we can get the windows versions to be as easy. >> > > Indeed -- where are we on that? Wasn't there more or less a consensus to put > up Windows Wheels with SSE2? > > Or did we decide that was going to break a few too many systems... > > I also recall that some folks were working with a new BLAS (OpenBLAS ? ) > that might support multi-architecture binaries...that would be a great > solution. In another conversation it looked as though OpenBLAS was not robust enough for a standard distribution. >From what Julian said elsewhere, we can completely rely on SSE2 being present for 64 bit, correct? [1] >From [2] it looks like Windows XP 64-bit is about 20 times less common than 32 bit XP, meaning likely something less than 2 percent of Windows users overall. So - can we build windows 64 bit SSE2 wheels for windows 7? With ATLAS for example? It sounds like they would be fairly safe. Cheers, Matthew [1] http://en.wikipedia.org/wiki/X86-64#Architectural_features [2] http://store.steampowered.com/hwsurvey?platform=pc From rays at blue-cove.com Wed Mar 26 23:39:12 2014 From: rays at blue-cove.com (RayS) Date: Wed, 26 Mar 2014 20:39:12 -0700 Subject: [Numpy-discussion] Windows wheels using MKL? Message-ID: <201403270339.s2R3dAEQ003173@blue-cove.com> I've often wondered the particulars of the MKL; I have licensed via Enthought and distributed compiled works to client(s), and often use C. Gohkle's distros myself. - Ray At 05:29 PM 3/26/2014, you wrote: >Hi, > >On Wed, Mar 26, 2014 at 4:48 PM, Matthew Brett > wrote: > > Hi, > > > > Can I check what is stopping us building official numpy binary wheels > > for Windows using the Intel Math Kernel Library? > > > > * We'd need developer licenses, but those sound like they would be > > easy to come by > > * We'd have to add something to the license for the wheel on the lines > > of the Canopy license [1], derived from the MKL license [2] - is that > > a problem? > > > > Are there other problems for numpy? > >Talking with Fernando, we identified these as being the key problem >clauses in the MKL license [1]: > > >D. DISTRIBUTION: Distribution of the Redistributables is also subject >to the following limitations: >[snipped clauses] > (iv) shall use a license agreement >that prohibits disassembly and reverse engineering of the >Redistributables, (v) shall indemnify, hold >harmless, and defend Intel and its suppliers from and against any >claims or lawsuits, including >attorney's fees, that arise or result from your distribution of any product. > > >The first is a problem that might conceivably be adequately solved by >adding a paragraph to the Pypi page for numpy ("If you download and >install the windows binaries, you also agree... ") and copying a new >clause into the license in the installed tree. Maybe. The second >looks like it would be very hard to deal with for open source project >like us.... > >Cheers (sadly), > >Matthew > >[1] >http://software.intel.com/en-us/articles/intel-software-development-products-license-agreement >_______________________________________________ >NumPy-Discussion mailing list >NumPy-Discussion at scipy.org >http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From Slaunger at gmail.com Thu Mar 27 03:02:38 2014 From: Slaunger at gmail.com (Slaunger) Date: Thu, 27 Mar 2014 00:02:38 -0700 (PDT) Subject: [Numpy-discussion] Is there a pure numpy recipe for this? In-Reply-To: References: <1395863332202-37077.post@n7.nabble.com> <1395865695422-37081.post@n7.nabble.com> <1395869031131-37087.post@n7.nabble.com> Message-ID: <1395903758783-37102.post@n7.nabble.com> Chris Barker - NOAA Federal wrote > note that numpy arrays are not re-sizable, so np.append() and np.insert() > have to make a new array, and copy all the old data over. If you are > appending one at a time, this can be pretty darn slow. > > I wrote a "grow_array" class once, it was a wrapper around a numpy array > that pre-allocated extra data to make appending more efficient. It's kind > of half-baked code now, but let me know if you are interested. Hi Chris, Yes, it is a good point and I am aware of it. For some of these functions it would have been nice if i could have parsed a preallocated, properly sliced array to the functions, which i could then reuse in each iteration step. It is indeed the memory allocation which appear to take more time than the actual calculations. Still it is much faster to create a few arrays than to loop through a thousand individual elements in pure Python. Interesting with the grow_array class. I think that what I have for now is sufficient, but i will keep your offer in mind:) --Slaunger -- View this message in context: http://numpy-discussion.10968.n7.nabble.com/Is-there-a-pure-numpy-recipe-for-this-tp37077p37102.html Sent from the Numpy-discussion mailing list archive at Nabble.com. From robert.kern at gmail.com Thu Mar 27 06:18:47 2014 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 27 Mar 2014 10:18:47 +0000 Subject: [Numpy-discussion] Windows wheels using MKL? In-Reply-To: References: Message-ID: On Thu, Mar 27, 2014 at 12:29 AM, Matthew Brett wrote: > Hi, > > On Wed, Mar 26, 2014 at 4:48 PM, Matthew Brett wrote: >> Hi, >> >> Can I check what is stopping us building official numpy binary wheels >> for Windows using the Intel Math Kernel Library? >> >> * We'd need developer licenses, but those sound like they would be >> easy to come by >> * We'd have to add something to the license for the wheel on the lines >> of the Canopy license [1], derived from the MKL license [2] - is that >> a problem? >> >> Are there other problems for numpy? > > Talking with Fernando, we identified these as being the key problem > clauses in the MKL license [1]: > > > D. DISTRIBUTION: Distribution of the Redistributables is also subject > to the following limitations: > [snipped clauses] > (iv) shall use a license agreement > that prohibits disassembly and reverse engineering of the > Redistributables, (v) shall indemnify, hold > harmless, and defend Intel and its suppliers from and against any > claims or lawsuits, including > attorney's fees, that arise or result from your distribution of any product. > > > The first is a problem that might conceivably be adequately solved by > adding a paragraph to the Pypi page for numpy ("If you download and > install the windows binaries, you also agree... ") and copying a new > clause into the license in the installed tree. Maybe. The second > looks like it would be very hard to deal with for open source project > like us.... It would be confusing to distribute these non-BSD wheels on the same PyPI page that declares most prominently that numpy is BSD-licensed. Adding some text elsewhere on the PyPI page is not going to help very much: people look at the "License: BSD" first and foremost. Nothing stops anyone else from building and distributing MKL-built binaries, a la C. Gohlke, but I don't think it is wise to do so on the PyPI page. -- Robert Kern From olivier.grisel at ensta.org Thu Mar 27 07:44:53 2014 From: olivier.grisel at ensta.org (Olivier Grisel) Date: Thu, 27 Mar 2014 12:44:53 +0100 Subject: [Numpy-discussion] Default builds of OpenBLAS development branch are now fork safe In-Reply-To: References: <1225660970414595360.835902sturla.molden-gmail.com@news.gmane.org> Message-ID: 2014-03-26 16:27 GMT+01:00 Olivier Grisel : > Hi Carl, > > I installed Python 2.7.6 64 bits on a windows server instance from > rackspace cloud and then ran get-pip.py and then could successfully > install the numpy and scipy wheel packages from your google drive > folder. I tested dot products and scipy.linalg.svd and they work as > expected. > > Then I uncompressed your mingw toolchain in c:\mingw, put c:\mingw\bin > in my PATH and tried to build the scikit-learn git master with it, > however it fails with: > > building 'sklearn.__check_build._check_build' extension > compiling C sources > C compiler: gcc -DMS_WIN64 -O2 -msse -msse2 -Wall -Wstrict-prototypes > > compile options: '-D__MSVCRT_VERSION__=0x0900 > -Ic:\Python27\lib\site-packages\numpy\core\include > -Ic:\Python27\lib\site-packages\numpy\core\include -Ic:\Python2 > 7\include -Ic:\Python27\PC -c' > gcc -DMS_WIN64 -O2 -msse -msse2 -Wall -Wstrict-prototypes > -D__MSVCRT_VERSION__=0x0900 > -Ic:\Python27\lib\site-packages\numpy\core\include > -Ic:\Python27\lib\site- > packages\numpy\core\include -Ic:\Python27\include -Ic:\Python27\PC -c > sklearn\__check_build\_check_build.c -o > build\temp.win-amd64-2.7\Release\sklearn\__check_b > uild\_check_build.o > Found executable c:\mingw\bin\gcc.exe > gcc -shared -Wl,-gc-sections -Wl,-s > build\temp.win-amd64-2.7\Release\sklearn\__check_build\_check_build.o > -Lc:\Python27\libs -Lc:\Python27\PCbuild\amd64 -Lbuild > \temp.win-amd64-2.7 -lpython27 -lmsvcr90 -o > build\lib.win-amd64-2.7\sklearn\__check_build\_check_build.pyd > build\temp.win-amd64-2.7\Release\sklearn\__check_build\_check_build.o:_check_build.c:(.text+0x3): > undefined reference to `__imp__Py_NoneStruct' > build\temp.win-amd64-2.7\Release\sklearn\__check_build\_check_build.o:_check_build.c:(.text+0x1ca): > undefined reference to `__imp__PyThreadState_Current' > build\temp.win-amd64-2.7\Release\sklearn\__check_build\_check_build.o:_check_build.c:(.text+0x405): > undefined reference to `__imp_PyExc_ImportError' > c:/mingw/bin/../lib/gcc/x86_64-w64-mingw32/4.8.2/../../../../x86_64-w64-mingw32/bin/ld.exe: > build\temp.win-amd64-2.7\Release\sklearn\__check_build\_check_build. > o: bad reloc address 0x0 in section `.data' > collect2.exe: error: ld returned 1 exit status > error: Command "gcc -shared -Wl,-gc-sections -Wl,-s > build\temp.win-amd64-2.7\Release\sklearn\__check_build\_check_build.o > -Lc:\Python27\libs -Lc:\Python27\PCbui > ld\amd64 -Lbuild\temp.win-amd64-2.7 -lpython27 -lmsvcr90 -o > build\lib.win-amd64-2.7\sklearn\__check_build\_check_build.pyd" failed > with exit status 1 Ignore that, I had forgotten to copy the libpython17.a file in c:\Python27\libs on that instance. Building scikit-learn works with the static toolchain. I have failing tests but those are probably not related to the toolchain. -- Olivier http://twitter.com/ogrisel - http://github.com/ogrisel From josef.pktd at gmail.com Thu Mar 27 09:55:52 2014 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 27 Mar 2014 09:55:52 -0400 Subject: [Numpy-discussion] Default builds of OpenBLAS development branch are now fork safe In-Reply-To: References: <1225660970414595360.835902sturla.molden-gmail.com@news.gmane.org> <53331DBF.6020504@googlemail.com> <533341BB.1070003@googlemail.com> Message-ID: On Wed, Mar 26, 2014 at 5:17 PM, Olivier Grisel wrote: > My understanding of Carl's effort is that the long term goal is to > have official windows whl packages for both numpy and scipy published > on PyPI with a builtin BLAS / LAPACK implementation so that users can > do `pip install scipy` under windows and get something that just works > without have to install any compiler (fortran or C) nor any additional > library manually. > > Most windows users are beginners and you cannot really expect them to > understand how to build the whole scipy stack from source. > > The current solution (executable setup installers) is not optimal as > it requires Administrator rights to run, does not resolve dependencies > as pip does and cannot be installed in virtualenvs. as small related point: The official installers can be used to install in virtualenv The way I do it: Run the superpack, official installer, wait until it extracts the correct (SSE) install exe, then cancel Then easy_install the install exe file that has been extracted to the temp folder into the virtualenv. I don't remember if the extraction already requires admin rights, but I think not. easy_install doesn't require any, IIRC. Josef > > If we can build numpy / scipy whl packages for windows with the Atlas > dlls then fine embedded in the numpy package then good. It does not > need to be the > fastest BLAS / LAPACK lib in my opinion. Just something that works. > > The problem with ATLAS is that you need to select the number of thread > at build time AFAIK. But we could set it to a reasonable default (e.g. > 4 threads) for the default windows package. > > -- > Olivier > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From olivier.grisel at ensta.org Thu Mar 27 09:59:26 2014 From: olivier.grisel at ensta.org (Olivier Grisel) Date: Thu, 27 Mar 2014 14:59:26 +0100 Subject: [Numpy-discussion] Default builds of OpenBLAS development branch are now fork safe In-Reply-To: References: <1225660970414595360.835902sturla.molden-gmail.com@news.gmane.org> <53331DBF.6020504@googlemail.com> <533341BB.1070003@googlemail.com> Message-ID: 2014-03-27 14:55 GMT+01:00 : > On Wed, Mar 26, 2014 at 5:17 PM, Olivier Grisel > wrote: >> My understanding of Carl's effort is that the long term goal is to >> have official windows whl packages for both numpy and scipy published >> on PyPI with a builtin BLAS / LAPACK implementation so that users can >> do `pip install scipy` under windows and get something that just works >> without have to install any compiler (fortran or C) nor any additional >> library manually. >> >> Most windows users are beginners and you cannot really expect them to >> understand how to build the whole scipy stack from source. >> >> The current solution (executable setup installers) is not optimal as >> it requires Administrator rights to run, does not resolve dependencies >> as pip does and cannot be installed in virtualenvs. > > as small related point: > > The official installers can be used to install in virtualenv > The way I do it: > Run the superpack, official installer, wait until it extracts the > correct (SSE) install exe, then cancel > Then easy_install the install exe file that has been extracted to the > temp folder into the virtualenv. > > I don't remember if the extraction already requires admin rights, but > I think not. > easy_install doesn't require any, IIRC. Hackish but interesting. Maybe the extraction can be done with generic tools like winzip? -- Olivier http://twitter.com/ogrisel - http://github.com/ogrisel From josef.pktd at gmail.com Thu Mar 27 10:13:29 2014 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 27 Mar 2014 10:13:29 -0400 Subject: [Numpy-discussion] Default builds of OpenBLAS development branch are now fork safe In-Reply-To: References: <1225660970414595360.835902sturla.molden-gmail.com@news.gmane.org> <53331DBF.6020504@googlemail.com> <533341BB.1070003@googlemail.com> Message-ID: On Thu, Mar 27, 2014 at 9:59 AM, Olivier Grisel wrote: > 2014-03-27 14:55 GMT+01:00 : >> On Wed, Mar 26, 2014 at 5:17 PM, Olivier Grisel >> wrote: >>> My understanding of Carl's effort is that the long term goal is to >>> have official windows whl packages for both numpy and scipy published >>> on PyPI with a builtin BLAS / LAPACK implementation so that users can >>> do `pip install scipy` under windows and get something that just works >>> without have to install any compiler (fortran or C) nor any additional >>> library manually. >>> >>> Most windows users are beginners and you cannot really expect them to >>> understand how to build the whole scipy stack from source. >>> >>> The current solution (executable setup installers) is not optimal as >>> it requires Administrator rights to run, does not resolve dependencies >>> as pip does and cannot be installed in virtualenvs. >> >> as small related point: >> >> The official installers can be used to install in virtualenv >> The way I do it: >> Run the superpack, official installer, wait until it extracts the >> correct (SSE) install exe, then cancel >> Then easy_install the install exe file that has been extracted to the >> temp folder into the virtualenv. >> >> I don't remember if the extraction already requires admin rights, but >> I think not. >> easy_install doesn't require any, IIRC. > > Hackish but interesting. Maybe the extraction can be done with generic > tools like winzip? I tried to open and unzip with WinRAR but couldn't make sense of the content. BTW: easy_install for other installers like matplotlib also works nicely for virtualenv ---- However, the official installers are only for 32-bit python, and I appreciate all the efforts to "modernize" the numpy and scipy builds. Josef > > -- > Olivier > http://twitter.com/ogrisel - http://github.com/ogrisel > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From rays at blue-cove.com Thu Mar 27 10:42:00 2014 From: rays at blue-cove.com (RayS) Date: Thu, 27 Mar 2014 07:42:00 -0700 Subject: [Numpy-discussion] Is there a pure numpy recipe for this? In-Reply-To: <1395903758783-37102.post@n7.nabble.com> References: <1395863332202-37077.post@n7.nabble.com> <1395865695422-37081.post@n7.nabble.com> <1395869031131-37087.post@n7.nabble.com> <1395903758783-37102.post@n7.nabble.com> Message-ID: <201403271442.s2REg37v025016@blue-cove.com> I find this interesting, since I work with medical data sets of 100s of MB, and regularly run into memory allocation problems when doing a lot of Fourrier analysis, waterfalls etc. The per-process limit seems to be about 1.3GB on this 6GB quad-i7 with Win7. For live data collection routines I simply creates zeros() of say 300MB and trim the array when saving to disk. memmaps are also limited to RAM, and take a looooong time to create (seconds). So, I've been investigating Pandas and segmentaxis - just a bit so far. - Ray Schumacher At 12:02 AM 3/27/2014, you wrote: >Chris Barker - NOAA Federal wrote > > note that numpy arrays are not re-sizable, so np.append() and np.insert() > > have to make a new array, and copy all the old data over. If you are > > appending one at a time, this can be pretty darn slow. > > > > I wrote a "grow_array" class once, it was a wrapper around a numpy array > > that pre-allocated extra data to make appending more efficient. It's kind > > of half-baked code now, but let me know if you are interested. > >Hi Chris, > >Yes, it is a good point and I am aware of it. For some of these functions it >would have been nice if i could have parsed a preallocated, properly sliced >array to the functions, which i could then reuse in each iteration step. > >It is indeed the memory allocation which appear to take more time than the >actual calculations. > >Still it is much faster to create a few arrays than to loop through a >thousand individual elements in pure Python. > >Interesting with the grow_array class. I think that what I have for now is >sufficient, but i will keep your offer in mind:) > >--Slaunger From aaron.oleary at gmail.com Thu Mar 27 12:19:54 2014 From: aaron.oleary at gmail.com (Aaron O'Leary) Date: Thu, 27 Mar 2014 16:19:54 +0000 Subject: [Numpy-discussion] Is there a pure numpy recipe for this? In-Reply-To: <201403271442.s2REg37v025016@blue-cove.com> References: <1395863332202-37077.post@n7.nabble.com> <1395865695422-37081.post@n7.nabble.com> <1395869031131-37087.post@n7.nabble.com> <1395903758783-37102.post@n7.nabble.com> <201403271442.s2REg37v025016@blue-cove.com> Message-ID: <20140327161954.GA10845@tk422.wireless.leeds.ac.uk> You might want to look at hdf5 if you're routinely running out of ram. I'm using h5py with multi gigabyte data on an ssd right now. It is very fast. You still have to be careful with your computations and try to avoid creating copies though. hypy: www.h5py.org aaron On Thu 27 Mar, RayS wrote: > I find this interesting, since I work with medical data sets of 100s > of MB, and regularly run into memory allocation problems when doing a > lot of Fourrier analysis, waterfalls etc. The per-process limit seems > to be about 1.3GB on this 6GB quad-i7 with Win7. For live data > collection routines I simply creates zeros() of say 300MB and trim > the array when saving to disk. memmaps are also limited to RAM, and > take a looooong time to create (seconds). So, I've been investigating > Pandas and segmentaxis - just a bit so far. > > - Ray Schumacher > > > At 12:02 AM 3/27/2014, you wrote: > >Chris Barker - NOAA Federal wrote > > > note that numpy arrays are not re-sizable, so np.append() and np.insert() > > > have to make a new array, and copy all the old data over. If you are > > > appending one at a time, this can be pretty darn slow. > > > > > > I wrote a "grow_array" class once, it was a wrapper around a numpy array > > > that pre-allocated extra data to make appending more efficient. It's kind > > > of half-baked code now, but let me know if you are interested. > > > >Hi Chris, > > > >Yes, it is a good point and I am aware of it. For some of these functions it > >would have been nice if i could have parsed a preallocated, properly sliced > >array to the functions, which i could then reuse in each iteration step. > > > >It is indeed the memory allocation which appear to take more time than the > >actual calculations. > > > >Still it is much faster to create a few arrays than to loop through a > >thousand individual elements in pure Python. > > > >Interesting with the grow_array class. I think that what I have for now is > >sufficient, but i will keep your offer in mind:) > > > >--Slaunger > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From Jerome.Kieffer at esrf.fr Thu Mar 27 13:31:20 2014 From: Jerome.Kieffer at esrf.fr (Jerome Kieffer) Date: Thu, 27 Mar 2014 18:31:20 +0100 Subject: [Numpy-discussion] Is there a pure numpy recipe for this? In-Reply-To: <20140327161954.GA10845@tk422.wireless.leeds.ac.uk> References: <1395863332202-37077.post@n7.nabble.com> <1395865695422-37081.post@n7.nabble.com> <1395869031131-37087.post@n7.nabble.com> <1395903758783-37102.post@n7.nabble.com> <201403271442.s2REg37v025016@blue-cove.com> <20140327161954.GA10845@tk422.wireless.leeds.ac.uk> Message-ID: <20140327183120.ddcf1ab8.Jerome.Kieffer@esrf.fr> On Thu, 27 Mar 2014 16:19:54 +0000 "Aaron O'Leary" wrote: > > You might want to look at hdf5 if you're routinely running out of ram. > I'm using h5py with multi gigabyte data on an ssd right now. It is very > fast. You still have to be careful with your computations and try to > avoid creating copies though. Both for h5py and for memmapped files ... switching from windows to linux are likely to help ... -- J?r?me Kieffer tel +33 476 882 445 From matthew.brett at gmail.com Thu Mar 27 15:10:43 2014 From: matthew.brett at gmail.com (Matthew Brett) Date: Thu, 27 Mar 2014 12:10:43 -0700 Subject: [Numpy-discussion] Windows wheels using MKL? In-Reply-To: References: Message-ID: Hi, On Thu, Mar 27, 2014 at 3:18 AM, Robert Kern wrote: > On Thu, Mar 27, 2014 at 12:29 AM, Matthew Brett wrote: >> Hi, >> >> On Wed, Mar 26, 2014 at 4:48 PM, Matthew Brett wrote: >>> Hi, >>> >>> Can I check what is stopping us building official numpy binary wheels >>> for Windows using the Intel Math Kernel Library? >>> >>> * We'd need developer licenses, but those sound like they would be >>> easy to come by >>> * We'd have to add something to the license for the wheel on the lines >>> of the Canopy license [1], derived from the MKL license [2] - is that >>> a problem? >>> >>> Are there other problems for numpy? >> >> Talking with Fernando, we identified these as being the key problem >> clauses in the MKL license [1]: >> >> >> D. DISTRIBUTION: Distribution of the Redistributables is also subject >> to the following limitations: >> [snipped clauses] >> (iv) shall use a license agreement >> that prohibits disassembly and reverse engineering of the >> Redistributables, (v) shall indemnify, hold >> harmless, and defend Intel and its suppliers from and against any >> claims or lawsuits, including >> attorney's fees, that arise or result from your distribution of any product. >> >> >> The first is a problem that might conceivably be adequately solved by >> adding a paragraph to the Pypi page for numpy ("If you download and >> install the windows binaries, you also agree... ") and copying a new >> clause into the license in the installed tree. Maybe. The second >> looks like it would be very hard to deal with for open source project >> like us.... > > It would be confusing to distribute these non-BSD wheels on the same > PyPI page that declares most prominently that numpy is BSD-licensed. > Adding some text elsewhere on the PyPI page is not going to help very > much: people look at the "License: BSD" first and foremost. Nothing > stops anyone else from building and distributing MKL-built binaries, a > la C. Gohlke, but I don't think it is wise to do so on the PyPI page. Can you see any circumstances in which we could use the MKL binaries from pypi? Cheers, Matthew From matthew.brett at gmail.com Thu Mar 27 15:18:59 2014 From: matthew.brett at gmail.com (Matthew Brett) Date: Thu, 27 Mar 2014 12:18:59 -0700 Subject: [Numpy-discussion] Windows wheels using MKL? In-Reply-To: References: Message-ID: Hi, On Thu, Mar 27, 2014 at 12:10 PM, Matthew Brett wrote: > Hi, > > On Thu, Mar 27, 2014 at 3:18 AM, Robert Kern wrote: >> On Thu, Mar 27, 2014 at 12:29 AM, Matthew Brett wrote: >>> Hi, >>> >>> On Wed, Mar 26, 2014 at 4:48 PM, Matthew Brett wrote: >>>> Hi, >>>> >>>> Can I check what is stopping us building official numpy binary wheels >>>> for Windows using the Intel Math Kernel Library? >>>> >>>> * We'd need developer licenses, but those sound like they would be >>>> easy to come by >>>> * We'd have to add something to the license for the wheel on the lines >>>> of the Canopy license [1], derived from the MKL license [2] - is that >>>> a problem? >>>> >>>> Are there other problems for numpy? >>> >>> Talking with Fernando, we identified these as being the key problem >>> clauses in the MKL license [1]: >>> >>> >>> D. DISTRIBUTION: Distribution of the Redistributables is also subject >>> to the following limitations: >>> [snipped clauses] >>> (iv) shall use a license agreement >>> that prohibits disassembly and reverse engineering of the >>> Redistributables, (v) shall indemnify, hold >>> harmless, and defend Intel and its suppliers from and against any >>> claims or lawsuits, including >>> attorney's fees, that arise or result from your distribution of any product. >>> >>> >>> The first is a problem that might conceivably be adequately solved by >>> adding a paragraph to the Pypi page for numpy ("If you download and >>> install the windows binaries, you also agree... ") and copying a new >>> clause into the license in the installed tree. Maybe. The second >>> looks like it would be very hard to deal with for open source project >>> like us.... >> >> It would be confusing to distribute these non-BSD wheels on the same >> PyPI page that declares most prominently that numpy is BSD-licensed. >> Adding some text elsewhere on the PyPI page is not going to help very >> much: people look at the "License: BSD" first and foremost. Nothing >> stops anyone else from building and distributing MKL-built binaries, a >> la C. Gohlke, but I don't think it is wise to do so on the PyPI page. > > Can you see any circumstances in which we could use the MKL binaries from pypi? Christoph - have you considered building binary wheels for the projects you support? If not, is there any help I / we can give? Cheers, Matthew From chris.barker at noaa.gov Thu Mar 27 15:31:35 2014 From: chris.barker at noaa.gov (Chris Barker) Date: Thu, 27 Mar 2014 12:31:35 -0700 Subject: [Numpy-discussion] Is there a pure numpy recipe for this? In-Reply-To: <201403271442.s2REg37v025016@blue-cove.com> References: <1395863332202-37077.post@n7.nabble.com> <1395865695422-37081.post@n7.nabble.com> <1395869031131-37087.post@n7.nabble.com> <1395903758783-37102.post@n7.nabble.com> <201403271442.s2REg37v025016@blue-cove.com> Message-ID: On Thu, Mar 27, 2014 at 7:42 AM, RayS wrote: > I find this interesting, since I work with medical data sets of 100s > of MB, and regularly run into memory allocation problems when doing a > lot of Fourrier analysis, waterfalls etc. The per-process limit seems > to be about 1.3GB on this 6GB quad-i7 with Win7. This sounds like 32 bit -- have you tried a 64 bit Python_numpy? Nt that you wont have issues anyway, but you should be abel to do better than 1.3GB... > memmaps are also limited to RAM, I don't think so, no -- but are limited to 2GB (I think) if you're using a 32 bit process There is also a compressed array package out there -- I can't remember what it's called -- but if you have large compressible arrays -- that might help. -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From rays at blue-cove.com Thu Mar 27 16:00:55 2014 From: rays at blue-cove.com (RayS) Date: Thu, 27 Mar 2014 13:00:55 -0700 Subject: [Numpy-discussion] Is there a pure numpy recipe for this? In-Reply-To: References: <1395863332202-37077.post@n7.nabble.com> <1395865695422-37081.post@n7.nabble.com> <1395869031131-37087.post@n7.nabble.com> <1395903758783-37102.post@n7.nabble.com> <201403271442.s2REg37v025016@blue-cove.com> Message-ID: <201403272000.s2RK0tmY026171@blue-cove.com> Thanks for all of the suggestions; we are migrating to 64bit Python soon as well. The environments are Win7 and Mac Maverics. carray sounds like what you said Chris - more I just found at http://kmike.ru/python-data-structures/ - Ray Schumacher At 12:31 PM 3/27/2014, you wrote: >On Thu, Mar 27, 2014 at 7:42 AM, RayS ><rays at blue-cove.com> wrote: >I find this interesting, since I work with medical data sets of 100s >of MB, and regularly run into memory allocation problems when doing a >lot of Fourrier analysis, waterfalls etc. The per-process limit seems >to be about 1.3GB on this 6GB quad-i7 with Win7. > > >This sounds like 32 bit -- have you tried a 64 >bit Python_numpy? Nt that you wont have issues >anyway, but you should be abel to do better than 1.3GB... >? >? memmaps are also limited to RAM, > > >I don't think so, no -- but are limited to 2GB >(I think) ? if you're using a 32 bit process > >There is also a compressed array package out >there -- I can't remember what it's called -- >but if you have large? compressible? arrays -- that might help. >? >-CHB > > >-- > >Christopher Barker, Ph.D. >Oceanographer > >Emergency Response Division >NOAA/NOS/OR&R ? ? ? ? ? ? (206) 526-6959? ? voice >7600 Sand Point Way NE ? ? (206) 526-6329? ? fax >Seattle, WA ? 98115 ? ? ? ? (206) 526-6317? ? main reception > >Chris.Barker at noaa.gov >_______________________________________________ >NumPy-Discussion mailing list >NumPy-Discussion at scipy.org >http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From alex.goodman at colostate.edu Thu Mar 27 16:30:27 2014 From: alex.goodman at colostate.edu (Alex Goodman) Date: Thu, 27 Mar 2014 14:30:27 -0600 Subject: [Numpy-discussion] f2py links extensions to incorrect python installation on OSX / Anaconda Message-ID: Hi all, I have used f2py in the past on a Linux machine with virtually no issues. However on my Mac, I get the following error when importing an f2py generated extension: Fatal Python error: PyThreadState_Get: no current thread Abort trap: 6 After doing some research I found out that the extension is linked to the wrong python installation: otool -L add.so add.so: ./add.so (compatibility version 0.0.0, current version 0.0.0) /System/Library/Frameworks/Python.framework/Versions/2.7/Python (compatibility version 2.7.0, current version 2.7.2) /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 169.3.0) This seems odd because I am using the f2py executable included in Anaconda 1.9.1. I can easily fix this problem by manually using install_name_tool -change on the extension to link the correct library location, but this is really cumbersome. Is there an alternative solution, such as an additional command-line argument when invoking f2py? For what it is worth, I am also using Version 14.0.2 of the Intel Fortran Compiler. Thanks, Alex -- Alex Goodman Graduate Research Assistant Department of Atmospheric Science Colorado State University -------------- next part -------------- An HTML attachment was scrubbed... URL: From cournape at gmail.com Thu Mar 27 16:50:03 2014 From: cournape at gmail.com (David Cournapeau) Date: Thu, 27 Mar 2014 20:50:03 +0000 Subject: [Numpy-discussion] f2py links extensions to incorrect python installation on OSX / Anaconda In-Reply-To: References: Message-ID: On Thu, Mar 27, 2014 at 8:30 PM, Alex Goodman wrote: > Hi all, > > I have used f2py in the past on a Linux machine with virtually no issues. > However on my Mac, I get the following error when importing an f2py > generated extension: > > Fatal Python error: PyThreadState_Get: no current thread > Abort trap: 6 > > After doing some research I found out that the extension is linked to the > wrong python installation: > otool -L add.so > add.so: > ./add.so (compatibility version 0.0.0, current version 0.0.0) > /System/Library/Frameworks/Python.framework/Versions/2.7/Python > (compatibility version 2.7.0, current version 2.7.2) > /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version > 169.3.0) > > This seems odd because I am using the f2py executable included in Anaconda > 1.9.1. I can easily fix this problem by manually using install_name_tool > -change on the extension to link the correct library location, but this is > really cumbersome. Is there an alternative solution, such as an additional > command-line argument when invoking f2py? > This sounds like an issue specific to Anaconda, and you may get better support on the Anaconda support ML. David > > For what it is worth, I am also using Version 14.0.2 of the Intel Fortran > Compiler. > > Thanks, > Alex > -- > Alex Goodman > Graduate Research Assistant > Department of Atmospheric Science > Colorado State University > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Thu Mar 27 17:02:27 2014 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 27 Mar 2014 21:02:27 +0000 Subject: [Numpy-discussion] f2py links extensions to incorrect python installation on OSX / Anaconda In-Reply-To: References: Message-ID: On Thu, Mar 27, 2014 at 8:50 PM, David Cournapeau wrote: > > On Thu, Mar 27, 2014 at 8:30 PM, Alex Goodman > wrote: >> >> Hi all, >> >> I have used f2py in the past on a Linux machine with virtually no issues. >> However on my Mac, I get the following error when importing an f2py >> generated extension: >> >> Fatal Python error: PyThreadState_Get: no current thread >> Abort trap: 6 >> >> After doing some research I found out that the extension is linked to the >> wrong python installation: >> otool -L add.so >> add.so: >> ./add.so (compatibility version 0.0.0, current version 0.0.0) >> /System/Library/Frameworks/Python.framework/Versions/2.7/Python >> (compatibility version 2.7.0, current version 2.7.2) >> /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version >> 169.3.0) >> >> This seems odd because I am using the f2py executable included in Anaconda >> 1.9.1. I can easily fix this problem by manually using install_name_tool >> -change on the extension to link the correct library location, but this is >> really cumbersome. Is there an alternative solution, such as an additional >> command-line argument when invoking f2py? > > > This sounds like an issue specific to Anaconda, and you may get better > support on the Anaconda support ML. I think it's our bug. numpy.distutils adds an explicit `-framework Python` in the Intel Fortran link line. We should be just be using `-undefined dynamic_lookup`. https://github.com/numpy/numpy/blob/master/numpy/distutils/fcompiler/intel.py#L71 Alex, can you edit that file to remove the '-Wl,-framework,Python' from that list and try building again? -- Robert Kern From robert.kern at gmail.com Thu Mar 27 17:04:32 2014 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 27 Mar 2014 21:04:32 +0000 Subject: [Numpy-discussion] [SciPy-Dev] Windows wheels using MKL? In-Reply-To: References: Message-ID: On Thu, Mar 27, 2014 at 7:10 PM, Matthew Brett wrote: > Hi, > > On Thu, Mar 27, 2014 at 3:18 AM, Robert Kern wrote: >> It would be confusing to distribute these non-BSD wheels on the same >> PyPI page that declares most prominently that numpy is BSD-licensed. >> Adding some text elsewhere on the PyPI page is not going to help very >> much: people look at the "License: BSD" first and foremost. Nothing >> stops anyone else from building and distributing MKL-built binaries, a >> la C. Gohlke, but I don't think it is wise to do so on the PyPI page. > > Can you see any circumstances in which we could use the MKL binaries from pypi? No. Most of the point of adding binary wheels to PyPI would be to make `pip install numpy` work. That gives users *no* chance to see any documentation about the proprietary license of those binaries. -- Robert Kern From alex.goodman at colostate.edu Thu Mar 27 17:11:59 2014 From: alex.goodman at colostate.edu (Alex Goodman) Date: Thu, 27 Mar 2014 15:11:59 -0600 Subject: [Numpy-discussion] f2py links extensions to incorrect python installation on OSX / Anaconda In-Reply-To: References: Message-ID: Hi Robert, That did the trick, thanks! Alex On Thu, Mar 27, 2014 at 3:02 PM, Robert Kern wrote: > On Thu, Mar 27, 2014 at 8:50 PM, David Cournapeau > wrote: > > > > On Thu, Mar 27, 2014 at 8:30 PM, Alex Goodman < > alex.goodman at colostate.edu> > > wrote: > >> > >> Hi all, > >> > >> I have used f2py in the past on a Linux machine with virtually no > issues. > >> However on my Mac, I get the following error when importing an f2py > >> generated extension: > >> > >> Fatal Python error: PyThreadState_Get: no current thread > >> Abort trap: 6 > >> > >> After doing some research I found out that the extension is linked to > the > >> wrong python installation: > >> otool -L add.so > >> add.so: > >> ./add.so (compatibility version 0.0.0, current version 0.0.0) > >> /System/Library/Frameworks/Python.framework/Versions/2.7/Python > >> (compatibility version 2.7.0, current version 2.7.2) > >> /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version > >> 169.3.0) > >> > >> This seems odd because I am using the f2py executable included in > Anaconda > >> 1.9.1. I can easily fix this problem by manually using install_name_tool > >> -change on the extension to link the correct library location, but this > is > >> really cumbersome. Is there an alternative solution, such as an > additional > >> command-line argument when invoking f2py? > > > > > > This sounds like an issue specific to Anaconda, and you may get better > > support on the Anaconda support ML. > > I think it's our bug. numpy.distutils adds an explicit `-framework > Python` in the Intel Fortran link line. We should be just be using > `-undefined dynamic_lookup`. > > > https://github.com/numpy/numpy/blob/master/numpy/distutils/fcompiler/intel.py#L71 > > Alex, can you edit that file to remove the '-Wl,-framework,Python' > from that list and try building again? > > -- > Robert Kern > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Alex Goodman Graduate Research Assistant Department of Atmospheric Science Colorado State University -------------- next part -------------- An HTML attachment was scrubbed... URL: From hoogendoorn.eelco at gmail.com Thu Mar 27 17:37:07 2014 From: hoogendoorn.eelco at gmail.com (Eelco Hoogendoorn) Date: Thu, 27 Mar 2014 22:37:07 +0100 Subject: [Numpy-discussion] Is there a pure numpy recipe for this? In-Reply-To: <201403272000.s2RK0tmY026171@blue-cove.com> References: <1395863332202-37077.post@n7.nabble.com> <1395865695422-37081.post@n7.nabble.com> <1395869031131-37087.post@n7.nabble.com> <1395903758783-37102.post@n7.nabble.com> <201403271442.s2REg37v025016@blue-cove.com> <201403272000.s2RK0tmY026171@blue-cove.com> Message-ID: Id recommend taking a look at pytables as well. It has support for out-of-core array computations on large arrays. On Thu, Mar 27, 2014 at 9:00 PM, RayS wrote: > Thanks for all of the suggestions; we are migrating to 64bit Python soon > as well. > The environments are Win7 and Mac Maverics. > carray sounds like what you said Chris - more I just found at > http://kmike.ru/python-data-structures/ > > - Ray Schumacher > > > > > At 12:31 PM 3/27/2014, you wrote: > > On Thu, Mar 27, 2014 at 7:42 AM, RayS wrote: > I find this interesting, since I work with medical data sets of 100s > of MB, and regularly run into memory allocation problems when doing a > lot of Fourrier analysis, waterfalls etc. The per-process limit seems > to be about 1.3GB on this 6GB quad-i7 with Win7. > > > This sounds like 32 bit -- have you tried a 64 bit Python_numpy? Nt that > you wont have issues anyway, but you should be abel to do better than > 1.3GB... > ? > ? memmaps are also limited to RAM, > > > I don't think so, no -- but are limited to 2GB (I think) ? if you're using > a 32 bit process > > There is also a compressed array package out there -- I can't remember > what it's called -- but if you have large? compressible? arrays -- that > might help. > ? > -CHB > > > -- > > Christopher Barker, Ph.D. > Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R ? ? ? ? ? ? (206) 526-6959? ? voice > 7600 Sand Point Way NE ? ? (206) 526-6329? ? fax > Seattle, WA ? 98115 ? ? ? ? (206) 526-6317? ? main reception > > Chris.Barker at noaa.gov > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Thu Mar 27 18:02:56 2014 From: matthew.brett at gmail.com (Matthew Brett) Date: Thu, 27 Mar 2014 15:02:56 -0700 Subject: [Numpy-discussion] [SciPy-Dev] Windows wheels using MKL? In-Reply-To: References: Message-ID: On Thu, Mar 27, 2014 at 2:04 PM, Robert Kern wrote: > On Thu, Mar 27, 2014 at 7:10 PM, Matthew Brett wrote: >> Hi, >> >> On Thu, Mar 27, 2014 at 3:18 AM, Robert Kern wrote: > >>> It would be confusing to distribute these non-BSD wheels on the same >>> PyPI page that declares most prominently that numpy is BSD-licensed. >>> Adding some text elsewhere on the PyPI page is not going to help very >>> much: people look at the "License: BSD" first and foremost. Nothing >>> stops anyone else from building and distributing MKL-built binaries, a >>> la C. Gohlke, but I don't think it is wise to do so on the PyPI page. >> >> Can you see any circumstances in which we could use the MKL binaries from pypi? > > No. Most of the point of adding binary wheels to PyPI would be to make > `pip install numpy` work. That gives users *no* chance to see any > documentation about the proprietary license of those binaries. OK - fair enough. Does anyone disagree? If not, I suggest we remove MKL from the options we consider in the future. Cheers, Matthew From smudkavi at uwaterloo.ca Thu Mar 27 23:59:33 2014 From: smudkavi at uwaterloo.ca (Sankarshan Mudkavi) Date: Thu, 27 Mar 2014 23:59:33 -0400 Subject: [Numpy-discussion] Dates and times and Datetime64 (again) In-Reply-To: References: <79374FB2-205D-4B76-ADB2-F9895D3A2DF4@uwaterloo.ca> Message-ID: Hi all, Apologies for the delay in following up, here is an expanded version of the proposal, which hopefully clears up most of the details. I have not included specific implementation details for the code, such as which functions to modify etc. since I think those are not traditionally included in NEPs? Please find attached the expanded proposal, and the rendered version is available here: https://github.com/Sankarshan-Mudkavi/numpy/blob/Enhance-datetime64/doc/neps/datetime-improvement-proposal.rst I look forward to comments, agreements/disagreements with this (and clarification if this needs even further expansion). Please find attached the On Mar 24, 2014, at 12:39 AM, Chris Barker wrote: > On Fri, Mar 21, 2014 at 3:43 PM, Nathaniel Smith wrote: > On Thu, Mar 20, 2014 at 11:27 PM, Chris Barker wrote: > > * I think there are more or less three options: > > 1) a) don't have any timezone handling at all -- all datetime64s are UTC. Always > > b) don't have any timezone handling at all -- all datetime64s are naive > > (the only difference between these two is I/O of strings, and maybe I/O of datetime objects with a time zone) > > 2) Have a time zone associated with the array -- defaulting to either UTC or None, but don't provide any implementation other than the tagging, with the ability to add in TZ handler if you want (can this be done efficiently?) > > 3) Full on proper TZ handling. > > > > I think (3) is off the table for now. > > I think the first goal is to define what a plain vanilla datetime64 > does, without any extra attributes. This is for two practical reasons: > First, our overriding #1 goal is to fix the nasty I/O problems that > default datetime64's show, so until that's done any other bells and > whistles are a distraction. And second, adding parameters to dtypes > right now is technically messy. > > This rules out (2) and (3). > > yup -- though I'm not sure I agree that we need to do this, if we are going to do something more later anyway. But you have a key point - maybe the dtype system simply isn't ready to do it right, and then it may be better not to try. > > In which case, we are down to naive or always UTC -- and again, those really aren't very different. Though I prefer naive -- always UTC adds some complication if you don't actually want UTC, and I'm not sure it actually buys us anything. And maybe it's jsut me, but all my code would need to use naive, so I"d be doing a bit of working around to use a UTC-always system. > > If we additionally want to keep the option of adding a timezone > parameter later, and have the result end up looking like stdlib > datetime, then I think 1(b) is the obvious choice. My guess is that > this is also what's most compatible with pandas, which is currently > keeping its own timezone object outside of the dtype. > > Good point, all else being equal, compatability with Pandas would be a good thing. > > Any downsides? I guess this would mean that we start raising an error > on ISO 8601's with offsets attached, which might annoy some people? > > yes, but errors are better than incorrect values... > > > Writing this made me think of a third option -- tracking, but no real manipulation, of TZ. This would be analogous to the ISO 8601 does -- all it does is note an offset. A given DateTime64 array would have a given offset assigned to it, and the appropriate addition and subtraction would happen at I/O. Offset of 0.00 would be UTC, and there would be a None option for naive. > > Please no! An integer offset is a terrible way to represent timezones, > > well, it would solve the being able to read ISO strings problem, and being able to perform operations with datetimes in multiple time zones. though I guess you could get most of that with UTC-always. > > and hardcoding this would just get in the way of a proper solution. > > well, that's a point -- if we think there is any hope of a proper solution down the road, then yes, it would be better not to make that harder. > > -Chris > > -- > > Christopher Barker, Ph.D. > Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R (206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > Chris.Barker at noaa.gov > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -- Sankarshan Mudkavi Undergraduate in Physics, University of Waterloo www.smudkavi.com -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: datetime-improvement-proposal.rst Type: application/octet-stream Size: 4913 bytes Desc: not available URL: -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 496 bytes Desc: Message signed with OpenPGP using GPGMail URL: From njs at pobox.com Fri Mar 28 05:17:24 2014 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 28 Mar 2014 10:17:24 +0100 Subject: [Numpy-discussion] Dates and times and Datetime64 (again) In-Reply-To: References: <79374FB2-205D-4B76-ADB2-F9895D3A2DF4@uwaterloo.ca> Message-ID: On 28 Mar 2014 05:00, "Sankarshan Mudkavi" wrote: > > Hi all, > > Apologies for the delay in following up, here is an expanded version of the proposal, which hopefully clears up most of the details. I have not included specific implementation details for the code, such as which functions to modify etc. since I think those are not traditionally included in NEPs? The format seems fine to me. Really the point is just to have a document that we can use as reference when deciding on behaviour, and this does that :-). Three quick comments: 1- You give as an example of "naive" datetime handling: >>> np.datetime64('2005-02-25T03:00Z') np.datetime64('2005-02-25T03:00') This IIUC is incorrect. The Z modifier is a timezone offset, and for normal "naive" datetimes would cause an error. 2- It would be good to include explicitly examples of conversion to and from datetimes alongside the examples of conversions to and from strings. 3- It would be good to (eventually) include some discussion of the impact of the preferred proposal on existing code. E.g., will this break a lot of people's pipelines? (Are people currently *always* adding timezones to their numpy input to avoid the problem, and now will have to switch to the opposite behaviour depending on numpy version?) And we'll want to make sure to get feedback from the pydata@ (pandas) list explicitly, though that can wait until people here have had a chance to respond to the first draft. Thanks for pushing this forward! -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Fri Mar 28 07:51:27 2014 From: matthew.brett at gmail.com (Matthew Brett) Date: Fri, 28 Mar 2014 04:51:27 -0700 Subject: [Numpy-discussion] Default builds of OpenBLAS development branch are now fork safe In-Reply-To: References: <1225660970414595360.835902sturla.molden-gmail.com@news.gmane.org> <53331DBF.6020504@googlemail.com> Message-ID: Hi, On Wed, Mar 26, 2014 at 1:41 PM, Nathaniel Smith wrote: > On Wed, Mar 26, 2014 at 7:34 PM, Julian Taylor > wrote: >> as for using openblas by default in binary builds, no. >> pthread openblas build is now fork safe which is great but it is still >> not reliable enough for a default. >> E.g. the current latest release 0.2.8 still has one crash bug on >> dgemv[1], and wrong results zherk/zer2[2] and dgemv/cgemv[3]. >> git head has the former four fixed bug still has wrong results for cgemv. >> The not so old 0.2.8 also fixed whole bunch more crashes and wrong >> result issues (crashes on QR, uninitialized data use in dgemm, ...). >> None of the fixes received unit tests, so I am somewhat pessimistic that >> it will improve, especially as the maintainer is dissertating (is that >> the right word?) and most of the code is assembler code only few people >> can write (it is simply not required anymore, we have code generators >> and intrinsics for that). >> >> Openblas is great if you do not have the patience to build ATLAS and >> only use a restricted set of functionality and platforms you can easily >> test. >> Currently it is in my opinion not suitable for a general purpose library >> like numpy. > > Those problems you list are pretty damning, but neither is it > reasonable to expect everyone to manually build ATLAS on every machine > they use (or their students use, or...) :-(. So what other options do > we have for general purpose builds? Give up and use MKL? How's > eigen-blas doing these days? (I guess from skimming their docs they > use OpenMP?) I see it should be possible to build a full blas and partial lapack library with eigen [1] [2]. Does anyone know how their performance compares to MKL or the reference implementations? Carl - have you tried building eigen with your custom tool chain? Cheers, Matthew [1] http://eigen.tuxfamily.org/index.php?title=3.0 [2] http://stackoverflow.com/questions/20441851/build-numpy-with-eigen-instead-of-atlas-or-openblas From sturla.molden at gmail.com Fri Mar 28 10:54:45 2014 From: sturla.molden at gmail.com (Sturla Molden) Date: Fri, 28 Mar 2014 14:54:45 +0000 (UTC) Subject: [Numpy-discussion] Default builds of OpenBLAS development branch are now fork safe References: <53331DBF.6020504@googlemail.com> Message-ID: <841461679417710892.547329sturla.molden-gmail.com@news.gmane.org> Matthew Brett wrote: > I see it should be possible to build a full blas and partial lapack > library with eigen [1] [2]. Eigen has a licensing issue as well, unfortunately, MPL2. E.g. it requires recipients to be informed of the MPL requirements (cf. impossible with pip install numpy). Sturla From alan.isaac at gmail.com Fri Mar 28 11:31:50 2014 From: alan.isaac at gmail.com (Alan G Isaac) Date: Fri, 28 Mar 2014 11:31:50 -0400 Subject: [Numpy-discussion] Default builds of OpenBLAS development branch are now fork safe In-Reply-To: <841461679417710892.547329sturla.molden-gmail.com@news.gmane.org> References: <53331DBF.6020504@googlemail.com> <841461679417710892.547329sturla.molden-gmail.com@news.gmane.org> Message-ID: <533595E6.30300@gmail.com> On 3/28/2014 10:54 AM, Sturla Molden wrote: > Eigen has a licensing issue as well, unfortunately, MPL2. > > E.g. it requires recipients to be informed of the MPL requirements (cf. > impossible with pip install numpy). Eigen chose MPL2 with the intent that Eigen be usable by "all projects". http://eigen.tuxfamily.org/index.php?title=Licensing_FAQ If you are correct in your interpretation, it may be worth raising the issue and requesting the needed accommodation. Alan From robert.kern at gmail.com Fri Mar 28 11:48:58 2014 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 28 Mar 2014 15:48:58 +0000 Subject: [Numpy-discussion] Default builds of OpenBLAS development branch are now fork safe In-Reply-To: <533595E6.30300@gmail.com> References: <53331DBF.6020504@googlemail.com> <841461679417710892.547329sturla.molden-gmail.com@news.gmane.org> <533595E6.30300@gmail.com> Message-ID: On Fri, Mar 28, 2014 at 3:31 PM, Alan G Isaac wrote: > On 3/28/2014 10:54 AM, Sturla Molden wrote: >> Eigen has a licensing issue as well, unfortunately, MPL2. >> >> E.g. it requires recipients to be informed of the MPL requirements (cf. >> impossible with pip install numpy). > > Eigen chose MPL2 with the intent that Eigen be usable by > "all projects". > http://eigen.tuxfamily.org/index.php?title=Licensing_FAQ > If you are correct in your interpretation, it may be worth > raising the issue and requesting the needed accommodation. The authors of Eigen are familiar with our policy on this matter. See the thread following this email: http://mail.scipy.org/pipermail/numpy-discussion/2010-January/047958.html The change from LGPL to MPL2 isn't relevant to our policy. Both have more restrictions and conditions than the BSD license. -- Robert Kern From robert.kern at gmail.com Fri Mar 28 11:58:15 2014 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 28 Mar 2014 15:58:15 +0000 Subject: [Numpy-discussion] Default builds of OpenBLAS development branch are now fork safe In-Reply-To: <841461679417710892.547329sturla.molden-gmail.com@news.gmane.org> References: <53331DBF.6020504@googlemail.com> <841461679417710892.547329sturla.molden-gmail.com@news.gmane.org> Message-ID: On Fri, Mar 28, 2014 at 2:54 PM, Sturla Molden wrote: > Matthew Brett wrote: > >> I see it should be possible to build a full blas and partial lapack >> library with eigen [1] [2]. > > Eigen has a licensing issue as well, unfortunately, MPL2. > > E.g. it requires recipients to be informed of the MPL requirements (cf. > impossible with pip install numpy). That's not the relevant condition. That's easily taken care of by including the MPL2 license text in the binary alongside numpy's BSD license text. This is no different than numpy's BSD license itself, which requires that the license text be included. It's not like people can't distribute any MPL2 project on PyPI just because pip doesn't print out the license before installing. The extra-BSD conditions of the MPL2 are sections 3.1 and 3.2. -- Robert Kern From njs at pobox.com Fri Mar 28 14:43:00 2014 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 28 Mar 2014 19:43:00 +0100 Subject: [Numpy-discussion] Default builds of OpenBLAS development branch are now fork safe In-Reply-To: References: <53331DBF.6020504@googlemail.com> <841461679417710892.547329sturla.molden-gmail.com@news.gmane.org> Message-ID: On Fri, Mar 28, 2014 at 4:58 PM, Robert Kern wrote: > On Fri, Mar 28, 2014 at 2:54 PM, Sturla Molden wrote: >> Matthew Brett wrote: >> >>> I see it should be possible to build a full blas and partial lapack >>> library with eigen [1] [2]. >> >> Eigen has a licensing issue as well, unfortunately, MPL2. >> >> E.g. it requires recipients to be informed of the MPL requirements (cf. >> impossible with pip install numpy). > > That's not the relevant condition. That's easily taken care of by > including the MPL2 license text in the binary alongside numpy's BSD > license text. This is no different than numpy's BSD license itself, > which requires that the license text be included. It's not like people > can't distribute any MPL2 project on PyPI just because pip doesn't > print out the license before installing. > > The extra-BSD conditions of the MPL2 are sections 3.1 and 3.2. Those requirements just say that in addition to including the MPL2 license text, we also have to include a notice saying where the source code is available, i.e. the package would have to somewhere include a link to eigen.org. https://www.mozilla.org/MPL/2.0/FAQ.html#distribute-my-binaries I'm not sure why this would be a problem. -n -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org From matthew.brett at gmail.com Fri Mar 28 14:49:12 2014 From: matthew.brett at gmail.com (Matthew Brett) Date: Fri, 28 Mar 2014 11:49:12 -0700 Subject: [Numpy-discussion] Default builds of OpenBLAS development branch are now fork safe In-Reply-To: References: <53331DBF.6020504@googlemail.com> <841461679417710892.547329sturla.molden-gmail.com@news.gmane.org> Message-ID: Hi, On Fri, Mar 28, 2014 at 8:58 AM, Robert Kern wrote: > On Fri, Mar 28, 2014 at 2:54 PM, Sturla Molden wrote: >> Matthew Brett wrote: >> >>> I see it should be possible to build a full blas and partial lapack >>> library with eigen [1] [2]. >> >> Eigen has a licensing issue as well, unfortunately, MPL2. >> >> E.g. it requires recipients to be informed of the MPL requirements (cf. >> impossible with pip install numpy). > > That's not the relevant condition. That's easily taken care of by > including the MPL2 license text in the binary alongside numpy's BSD > license text. This is no different than numpy's BSD license itself, > which requires that the license text be included. It's not like people > can't distribute any MPL2 project on PyPI just because pip doesn't > print out the license before installing. > > The extra-BSD conditions of the MPL2 are sections 3.1 and 3.2. Thanks for thinking this through. If I read you right, your opinion is that there would be no problem including Eigen binaries with Numpy. License here: http://www.mozilla.org/MPL/2.0/ Section 3.1 - if we distribute Eigen source, it has to be under the MPL; so that doesn't apply to binaries. Section 3.2 a) says that if we distribute binaries, we have to point to the original source (e.g. link on pypi page) Section 3.2 b) says we can distribute the executable code under any license as long as it doesn't restrict the user's access to Eigen source. I think that means there is no problem distributing the binaries under the BSD license. Nathaniel's link is the relevant one : http://www.mozilla.org/MPL/2.0/FAQ.html#distribute-my-binaries So - is Eigen our best option for optimized blas / lapack binaries on 64 bit Windows? Cheers, Matthew From sturla.molden at gmail.com Fri Mar 28 14:56:09 2014 From: sturla.molden at gmail.com (Sturla Molden) Date: Fri, 28 Mar 2014 18:56:09 +0000 (UTC) Subject: [Numpy-discussion] Default builds of OpenBLAS development branch are now fork safe References: <53331DBF.6020504@googlemail.com> Message-ID: <1168130578417724894.234999sturla.molden-gmail.com@news.gmane.org> Matthew Brett wrote: > Does anyone know how their performance compares to MKL or the > reference implementations? http://eigen.tuxfamily.org/index.php?title=Benchmark http://gcdart.blogspot.de/2013/06/fast-matrix-multiply-and-ml.html Sturla From sturla.molden at gmail.com Fri Mar 28 15:01:03 2014 From: sturla.molden at gmail.com (Sturla Molden) Date: Fri, 28 Mar 2014 19:01:03 +0000 (UTC) Subject: [Numpy-discussion] Default builds of OpenBLAS development branch are now fork safe References: <53331DBF.6020504@googlemail.com> <841461679417710892.547329sturla.molden-gmail.com@news.gmane.org> Message-ID: <1580568601417725969.631584sturla.molden-gmail.com@news.gmane.org> Matthew Brett wrote: > So - is Eigen our best option for optimized blas / lapack binaries on > 64 bit Windows? Maybe not: http://gcdart.blogspot.de/2013/06/fast-matrix-multiply-and-ml.html With AVX the difference is possibly even larger. Sturla From njs at pobox.com Fri Mar 28 15:23:34 2014 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 28 Mar 2014 20:23:34 +0100 Subject: [Numpy-discussion] Default builds of OpenBLAS development branch are now fork safe In-Reply-To: <1580568601417725969.631584sturla.molden-gmail.com@news.gmane.org> References: <53331DBF.6020504@googlemail.com> <841461679417710892.547329sturla.molden-gmail.com@news.gmane.org> <1580568601417725969.631584sturla.molden-gmail.com@news.gmane.org> Message-ID: On Fri, Mar 28, 2014 at 8:01 PM, Sturla Molden wrote: > Matthew Brett wrote: > >> So - is Eigen our best option for optimized blas / lapack binaries on >> 64 bit Windows? > > Maybe not: > > http://gcdart.blogspot.de/2013/06/fast-matrix-multiply-and-ml.html > > With AVX the difference is possibly even larger. But if we rule out closed-source BLAS, and we rule out OpenBLAS because of our distrusting its accuracy, and we aren't going to recompile ATLAS on every machine, then Eigen is the only library they tested that is even an option for us. It would be nice to see some comparison between our actual options -- Eigen, generically compiled ATLAS, anything else? -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org From robert.kern at gmail.com Fri Mar 28 15:26:02 2014 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 28 Mar 2014 19:26:02 +0000 Subject: [Numpy-discussion] Default builds of OpenBLAS development branch are now fork safe In-Reply-To: References: <53331DBF.6020504@googlemail.com> <841461679417710892.547329sturla.molden-gmail.com@news.gmane.org> Message-ID: It's only a problem in that the binary will not be BSD, and we do need to communicate that appropriately. It will contain a significant component that is MPL2 licensed. The terms that force us to include the link to the Eigen source that we used forces downstream redistributors of the binary to do the same. Now, of all the copyleft licenses, this is certainly the most friendly, but it is not BSD. On Mar 28, 2014 6:43 PM, "Nathaniel Smith" wrote: > On Fri, Mar 28, 2014 at 4:58 PM, Robert Kern > wrote: > > On Fri, Mar 28, 2014 at 2:54 PM, Sturla Molden > wrote: > >> Matthew Brett wrote: > >> > >>> I see it should be possible to build a full blas and partial lapack > >>> library with eigen [1] [2]. > >> > >> Eigen has a licensing issue as well, unfortunately, MPL2. > >> > >> E.g. it requires recipients to be informed of the MPL requirements (cf. > >> impossible with pip install numpy). > > > > That's not the relevant condition. That's easily taken care of by > > including the MPL2 license text in the binary alongside numpy's BSD > > license text. This is no different than numpy's BSD license itself, > > which requires that the license text be included. It's not like people > > can't distribute any MPL2 project on PyPI just because pip doesn't > > print out the license before installing. > > > > The extra-BSD conditions of the MPL2 are sections 3.1 and 3.2. > > Those requirements just say that in addition to including the MPL2 > license text, we also have to include a notice saying where the source > code is available, i.e. the package would have to somewhere include a > link to eigen.org. > > https://www.mozilla.org/MPL/2.0/FAQ.html#distribute-my-binaries > > I'm not sure why this would be a problem. > > -n > > -- > Nathaniel J. Smith > Postdoctoral researcher - Informatics - University of Edinburgh > http://vorpus.org > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Fri Mar 28 15:28:04 2014 From: matthew.brett at gmail.com (Matthew Brett) Date: Fri, 28 Mar 2014 12:28:04 -0700 Subject: [Numpy-discussion] Default builds of OpenBLAS development branch are now fork safe In-Reply-To: <1168130578417724894.234999sturla.molden-gmail.com@news.gmane.org> References: <53331DBF.6020504@googlemail.com> <1168130578417724894.234999sturla.molden-gmail.com@news.gmane.org> Message-ID: Hi, On Fri, Mar 28, 2014 at 11:56 AM, Sturla Molden wrote: > Matthew Brett wrote: > >> Does anyone know how their performance compares to MKL or the >> reference implementations? > > http://eigen.tuxfamily.org/index.php?title=Benchmark I don't know how relevant these are to our case. If I understand correctly, the usual use of Eigen, as in these benchmarks, is to use the Eigen headers to get fast code via C++ templating. Because they know some of us need this, Eigen can also build a more standard blas / lapack library to link against, but I presume this will stop Eigen templating doing lots of clever tricks with the operations, and therefore slow it down. Happy to be corrected though. > http://gcdart.blogspot.de/2013/06/fast-matrix-multiply-and-ml.html I think this page does not use the Eigen blas libraries either [1] Also - this is on a massive linux machine ("48 core and 66GB RAM"). He's done a great job of showing what he did though. The problem for us is: We can't use MKL, ACML [2] atlas is very difficult to compile on 64 bit windows, and has some technical limitations on 64 bit [3] So I think we're down to openblas and eigen for 64-bit windows. Does anyone disagree? Cheers, Matthew [1] : https://github.com/gcdart/dense-matrix-mult/blob/master/EIGEN/compile_eigen.sh [2] : http://amd-dev.wpengine.netdna-cdn.com/wordpress/media/2013/12/ACML_June_24_2010_v2.pdf [3] : http://math-atlas.sourceforge.net/atlas_install/node57.html From matthew.brett at gmail.com Fri Mar 28 15:32:31 2014 From: matthew.brett at gmail.com (Matthew Brett) Date: Fri, 28 Mar 2014 12:32:31 -0700 Subject: [Numpy-discussion] Default builds of OpenBLAS development branch are now fork safe In-Reply-To: References: <53331DBF.6020504@googlemail.com> <841461679417710892.547329sturla.molden-gmail.com@news.gmane.org> Message-ID: Hi, On Fri, Mar 28, 2014 at 12:26 PM, Robert Kern wrote: > It's only a problem in that the binary will not be BSD, and we do need to > communicate that appropriately. It will contain a significant component that > is MPL2 licensed. The terms that force us to include the link to the Eigen > source that we used forces downstream redistributors of the binary to do the > same. Now, of all the copyleft licenses, this is certainly the most > friendly, but it is not BSD. I think the binary would be BSD because of section 3.2: "You may distribute such Executable Form under the terms of this License, or sublicense it under different terms, provided that the license for the Executable Form does not attempt to limit or alter the recipients' rights in the Source Code Form under this License." I think this is specifically saying - as long as our license (BSD) does not try and limit access to Eigen source, we can distribute our binary under our license. Cheers, Matthew From njs at pobox.com Fri Mar 28 15:34:16 2014 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 28 Mar 2014 20:34:16 +0100 Subject: [Numpy-discussion] Default builds of OpenBLAS development branch are now fork safe In-Reply-To: References: <53331DBF.6020504@googlemail.com> <841461679417710892.547329sturla.molden-gmail.com@news.gmane.org> Message-ID: On 28 Mar 2014 20:26, "Robert Kern" wrote: > > It's only a problem in that the binary will not be BSD, and we do need to communicate that appropriately. It will contain a significant component that is MPL2 licensed. The terms that force us to include the link to the Eigen source that we used forces downstream redistributors of the binary to do the same. Now, of all the copyleft licenses, this is certainly the most friendly, but it is not BSD. AFAICT, the only way redistributers could violate the MPL would be if they unpacked our binary and deleted the license file. But this would also be a violation of the BSD. The only difference in terms of requirements on redistributors between MPL and BSD seems to be exactly *which* text you include in your license file. I don't know if Eigen is a good choice on technical grounds (or even a possible one - has anyone ever actually compiled numpy against it?), but this license thing just doesn't seem like an important issue to me, if the alternative is not providing useful binaries. -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Fri Mar 28 15:37:34 2014 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 28 Mar 2014 19:37:34 +0000 Subject: [Numpy-discussion] Default builds of OpenBLAS development branch are now fork safe In-Reply-To: References: <53331DBF.6020504@googlemail.com> <841461679417710892.547329sturla.molden-gmail.com@news.gmane.org> Message-ID: The BSD license alters the recipient's rights. BSD binaries can be redistributed without pointing to the sources. On Mar 28, 2014 7:33 PM, "Matthew Brett" wrote: > Hi, > > On Fri, Mar 28, 2014 at 12:26 PM, Robert Kern > wrote: > > It's only a problem in that the binary will not be BSD, and we do need to > > communicate that appropriately. It will contain a significant component > that > > is MPL2 licensed. The terms that force us to include the link to the > Eigen > > source that we used forces downstream redistributors of the binary to do > the > > same. Now, of all the copyleft licenses, this is certainly the most > > friendly, but it is not BSD. > > I think the binary would be BSD because of section 3.2: > > "You may distribute such Executable Form under the terms of this > License, or sublicense it under different terms, provided that the > license for the Executable Form does not attempt to limit or alter the > recipients' rights in the Source Code Form under this License." > > I think this is specifically saying - as long as our license (BSD) > does not try and limit access to Eigen source, we can distribute our > binary under our license. > > Cheers, > > Matthew > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Fri Mar 28 15:40:06 2014 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 28 Mar 2014 19:40:06 +0000 Subject: [Numpy-discussion] Default builds of OpenBLAS development branch are now fork safe In-Reply-To: References: <53331DBF.6020504@googlemail.com> <841461679417710892.547329sturla.molden-gmail.com@news.gmane.org> Message-ID: No, the license does not contain a pointer to the Eigen sources, which is required. https://bitbucket.org/eigen/eigen/src/fabd880592ac3343713cc07e7287098afd0f18ca/COPYING.MPL2?at=default On Mar 28, 2014 7:34 PM, "Nathaniel Smith" wrote: > On 28 Mar 2014 20:26, "Robert Kern" wrote: > > > > It's only a problem in that the binary will not be BSD, and we do need > to communicate that appropriately. It will contain a significant component > that is MPL2 licensed. The terms that force us to include the link to the > Eigen source that we used forces downstream redistributors of the binary to > do the same. Now, of all the copyleft licenses, this is certainly the most > friendly, but it is not BSD. > > AFAICT, the only way redistributers could violate the MPL would be if they > unpacked our binary and deleted the license file. But this would also be a > violation of the BSD. The only difference in terms of requirements on > redistributors between MPL and BSD seems to be exactly *which* text you > include in your license file. > > I don't know if Eigen is a good choice on technical grounds (or even a > possible one - has anyone ever actually compiled numpy against it?), but > this license thing just doesn't seem like an important issue to me, if the > alternative is not providing useful binaries. > > -n > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From andrea.gavana at gmail.com Fri Mar 28 15:43:10 2014 From: andrea.gavana at gmail.com (Andrea Gavana) Date: Fri, 28 Mar 2014 20:43:10 +0100 Subject: [Numpy-discussion] Default builds of OpenBLAS development branch are now fork safe In-Reply-To: <1168130578417724894.234999sturla.molden-gmail.com@news.gmane.org> References: <53331DBF.6020504@googlemail.com> <1168130578417724894.234999sturla.molden-gmail.com@news.gmane.org> Message-ID: On 28 March 2014 19:56, Sturla Molden wrote: > Matthew Brett wrote: > > > Does anyone know how their performance compares to MKL or the > > reference implementations? > > http://eigen.tuxfamily.org/index.php?title=Benchmark Very, very funny and twisted approach to legend-ordering-in-a-plot approach. Maybe someone more knowledgeable can explain the ordering of the labels in the plot legends, as after a while they don't make any sense - neither lexicographically nor performance-wise. Andrea. "Imagination Is The Only Weapon In The War Against Reality." http://www.infinity77.net # ------------------------------------------------------------- # def ask_mailing_list_support(email): if mention_platform_and_version() and include_sample_app(): send_message(email) else: install_malware() erase_hard_drives() # ------------------------------------------------------------- # -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Fri Mar 28 15:57:30 2014 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 28 Mar 2014 19:57:30 +0000 Subject: [Numpy-discussion] Default builds of OpenBLAS development branch are now fork safe In-Reply-To: References: <53331DBF.6020504@googlemail.com> <841461679417710892.547329sturla.molden-gmail.com@news.gmane.org> Message-ID: Of course, that's besides the point. Yes, pretty much everyone that likes the BSD license of numpy will be okay with the minimal burdens the MPL2 lays on them. The problem is that we need to properly communicate that license. The PyPI page is not adequate to that task, in my opinion. I have no problem with the project distributing such binaries anywhere else. But then, I have no problem with the project distributing MKL binaries elsewhere either. On Mar 28, 2014 7:34 PM, "Nathaniel Smith" wrote: > On 28 Mar 2014 20:26, "Robert Kern" wrote: > > > > It's only a problem in that the binary will not be BSD, and we do need > to communicate that appropriately. It will contain a significant component > that is MPL2 licensed. The terms that force us to include the link to the > Eigen source that we used forces downstream redistributors of the binary to > do the same. Now, of all the copyleft licenses, this is certainly the most > friendly, but it is not BSD. > > AFAICT, the only way redistributers could violate the MPL would be if they > unpacked our binary and deleted the license file. But this would also be a > violation of the BSD. The only difference in terms of requirements on > redistributors between MPL and BSD seems to be exactly *which* text you > include in your license file. > > I don't know if Eigen is a good choice on technical grounds (or even a > possible one - has anyone ever actually compiled numpy against it?), but > this license thing just doesn't seem like an important issue to me, if the > alternative is not providing useful binaries. > > -n > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Fri Mar 28 16:11:45 2014 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 28 Mar 2014 21:11:45 +0100 Subject: [Numpy-discussion] Default builds of OpenBLAS development branch are now fork safe In-Reply-To: References: <53331DBF.6020504@googlemail.com> <841461679417710892.547329sturla.molden-gmail.com@news.gmane.org> Message-ID: Yes, because they're distributing source. But *our* license file could contain the text of the BSD, the text of the MPL, and the text "Eigen source is available at http://eigen.org." If the only problem with eigen turns out to be that we have to add a line of text to a file then I think we can probably manage this somehow. -n On 28 Mar 2014 20:40, "Robert Kern" wrote: > No, the license does not contain a pointer to the Eigen sources, which is > required. > > > https://bitbucket.org/eigen/eigen/src/fabd880592ac3343713cc07e7287098afd0f18ca/COPYING.MPL2?at=default > On Mar 28, 2014 7:34 PM, "Nathaniel Smith" wrote: > >> On 28 Mar 2014 20:26, "Robert Kern" wrote: >> > >> > It's only a problem in that the binary will not be BSD, and we do need >> to communicate that appropriately. It will contain a significant component >> that is MPL2 licensed. The terms that force us to include the link to the >> Eigen source that we used forces downstream redistributors of the binary to >> do the same. Now, of all the copyleft licenses, this is certainly the most >> friendly, but it is not BSD. >> >> AFAICT, the only way redistributers could violate the MPL would be if >> they unpacked our binary and deleted the license file. But this would also >> be a violation of the BSD. The only difference in terms of requirements on >> redistributors between MPL and BSD seems to be exactly *which* text you >> include in your license file. >> >> I don't know if Eigen is a good choice on technical grounds (or even a >> possible one - has anyone ever actually compiled numpy against it?), but >> this license thing just doesn't seem like an important issue to me, if the >> alternative is not providing useful binaries. >> >> -n >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sturla.molden at gmail.com Fri Mar 28 16:28:09 2014 From: sturla.molden at gmail.com (Sturla Molden) Date: Fri, 28 Mar 2014 20:28:09 +0000 (UTC) Subject: [Numpy-discussion] Default builds of OpenBLAS development branch are now fork safe References: <841461679417710892.547329sturla.molden-gmail.com@news.gmane.org> Message-ID: <1473349758417731165.921313sturla.molden-gmail.com@news.gmane.org> Nathaniel Smith wrote: > If the only problem with eigen turns out to be that we have to add a line > of text to a file then I think we can probably manage this somehow. We would also have to compile Eigen-BLAS for various architectures and CPU counts. It is not "adaptive" like MKL or OpenBLAS. Sturla From smudkavi at uwaterloo.ca Fri Mar 28 16:30:54 2014 From: smudkavi at uwaterloo.ca (Sankarshan Mudkavi) Date: Fri, 28 Mar 2014 16:30:54 -0400 Subject: [Numpy-discussion] Dates and times and Datetime64 (again) In-Reply-To: References: <79374FB2-205D-4B76-ADB2-F9895D3A2DF4@uwaterloo.ca> Message-ID: <614DCCFD-BFFC-496D-B721-A08F68FFD6D2@uwaterloo.ca> Hi Nathaniel, > 1- You give as an example of "naive" datetime handling: > > >>> np.datetime64('2005-02-25T03:00Z') > np.datetime64('2005-02-25T03:00') > > This IIUC is incorrect. The Z modifier is a timezone offset, and for normal "naive" datetimes would cause an error. > If what I understand from reading: http://thread.gmane.org/gmane.comp.python.numeric.general/53805 It looks like anything other than Z, 00:00 or UTC that has a TZ adjustment would raise an error, and those specific conditions would not (I'm guessing this is because we assume it's UTC (or the same timezone) internally, anything that explicitly tells us it is UTC is acceptable, although that may be just my misreading of it.) However on output we don't use the Z modifier (which is why it's different from the UTC datetime64). I will change it to return an error if what I thought is incorrect and also include examples of conversion from datetimes as you requested. Please let me know if there are any more changes that are required! I look forward to further comments/questions. Cheers, Sankarshan > On Fri, Mar 28, 2014 at 5:17 AM, Nathaniel Smith wrote: > On 28 Mar 2014 05:00, "Sankarshan Mudkavi" wrote: > > > > Hi all, > > > > Apologies for the delay in following up, here is an expanded version of the proposal, which hopefully clears up most of the details. I have not included specific implementation details for the code, such as which functions to modify etc. since I think those are not traditionally included in NEPs? > > The format seems fine to me. Really the point is just to have a document that we can use as reference when deciding on behaviour, and this does that :-). > > Three quick comments: > > 1- You give as an example of "naive" datetime handling: > > >>> np.datetime64('2005-02-25T03:00Z') > np.datetime64('2005-02-25T03:00') > > This IIUC is incorrect. The Z modifier is a timezone offset, and for normal "naive" datetimes would cause an error. > > 2- It would be good to include explicitly examples of conversion to and from datetimes alongside the examples of conversions to and from strings. > > 3- It would be good to (eventually) include some discussion of the impact of the preferred proposal on existing code. E.g., will this break a lot of people's pipelines? (Are people currently *always* adding timezones to their numpy input to avoid the problem, and now will have to switch to the opposite behaviour depending on numpy version?) And we'll want to make sure to get feedback from the pydata@ (pandas) list explicitly, though that can wait until people here have had a chance to respond to the first draft. > > Thanks for pushing this forward! > -n > >> Hi all, >> >> Apologies for the delay in following up, here is an expanded version of the proposal, which hopefully clears up most of the details. I have not included specific implementation details for the code, such as which functions to modify etc. since I think those are not traditionally included in NEPs? >> >> Please find attached the expanded proposal, and the rendered version is available here: >> https://github.com/Sankarshan-Mudkavi/numpy/blob/Enhance-datetime64/doc/neps/datetime-improvement-proposal.rst >> >> >> >> I look forward to comments, agreements/disagreements with this (and clarification if this needs even further expansion). >> >> >> Please find attached the >> On Mar 24, 2014, at 12:39 AM, Chris Barker wrote: >> >>> On Fri, Mar 21, 2014 at 3:43 PM, Nathaniel Smith wrote: >>> On Thu, Mar 20, 2014 at 11:27 PM, Chris Barker wrote: >>> > * I think there are more or less three options: >>> > 1) a) don't have any timezone handling at all -- all datetime64s are UTC. Always >>> > b) don't have any timezone handling at all -- all datetime64s are naive >>> > (the only difference between these two is I/O of strings, and maybe I/O of datetime objects with a time zone) >>> > 2) Have a time zone associated with the array -- defaulting to either UTC or None, but don't provide any implementation other than the tagging, with the ability to add in TZ handler if you want (can this be done efficiently?) >>> > 3) Full on proper TZ handling. >>> > >>> > I think (3) is off the table for now. >>> >>> I think the first goal is to define what a plain vanilla datetime64 >>> does, without any extra attributes. This is for two practical reasons: >>> First, our overriding #1 goal is to fix the nasty I/O problems that >>> default datetime64's show, so until that's done any other bells and >>> whistles are a distraction. And second, adding parameters to dtypes >>> right now is technically messy. >>> >>> This rules out (2) and (3). >>> >>> yup -- though I'm not sure I agree that we need to do this, if we are going to do something more later anyway. But you have a key point - maybe the dtype system simply isn't ready to do it right, and then it may be better not to try. >>> >>> In which case, we are down to naive or always UTC -- and again, those really aren't very different. Though I prefer naive -- always UTC adds some complication if you don't actually want UTC, and I'm not sure it actually buys us anything. And maybe it's jsut me, but all my code would need to use naive, so I"d be doing a bit of working around to use a UTC-always system. >>> >>> If we additionally want to keep the option of adding a timezone >>> parameter later, and have the result end up looking like stdlib >>> datetime, then I think 1(b) is the obvious choice. My guess is that >>> this is also what's most compatible with pandas, which is currently >>> keeping its own timezone object outside of the dtype. >>> >>> Good point, all else being equal, compatability with Pandas would be a good thing. >>> >>> Any downsides? I guess this would mean that we start raising an error >>> on ISO 8601's with offsets attached, which might annoy some people? >>> >>> yes, but errors are better than incorrect values... >>> >>> > Writing this made me think of a third option -- tracking, but no real manipulation, of TZ. This would be analogous to the ISO 8601 does -- all it does is note an offset. A given DateTime64 array would have a given offset assigned to it, and the appropriate addition and subtraction would happen at I/O. Offset of 0.00 would be UTC, and there would be a None option for naive. >>> >>> Please no! An integer offset is a terrible way to represent timezones, >>> >>> well, it would solve the being able to read ISO strings problem, and being able to perform operations with datetimes in multiple time zones. though I guess you could get most of that with UTC-always. >>> >>> and hardcoding this would just get in the way of a proper solution. >>> >>> well, that's a point -- if we think there is any hope of a proper solution down the road, then yes, it would be better not to make that harder. >>> >>> -Chris >>> >>> -- >>> >>> Christopher Barker, Ph.D. >>> Oceanographer >>> >>> Emergency Response Division >>> NOAA/NOS/OR&R (206) 526-6959 voice >>> 7600 Sand Point Way NE (206) 526-6329 fax >>> Seattle, WA 98115 (206) 526-6317 main reception >>> >>> Chris.Barker at noaa.gov >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> -- >> Sankarshan Mudkavi >> Undergraduate in Physics, University of Waterloo >> www.smudkavi.com >> >> >> >> >> >> > -- Sankarshan Mudkavi Undergraduate in Physics, University of Waterloo www.smudkavi.com -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 496 bytes Desc: Message signed with OpenPGP using GPGMail URL: From jeffreback at gmail.com Fri Mar 28 16:39:36 2014 From: jeffreback at gmail.com (Jeff Reback) Date: Fri, 28 Mar 2014 16:39:36 -0400 Subject: [Numpy-discussion] Dates and times and Datetime64 (again) In-Reply-To: <614DCCFD-BFFC-496D-B721-A08F68FFD6D2@uwaterloo.ca> References: <79374FB2-205D-4B76-ADB2-F9895D3A2DF4@uwaterloo.ca> <614DCCFD-BFFC-496D-B721-A08F68FFD6D2@uwaterloo.ca> Message-ID: FYI Here are docs for panda of timezone handling wesm worked thru the various issues w.r.t. conversion, localization, and ambiguous zone crossing. http://pandas.pydata.org/pandas-docs/stable/timeseries.html#time-zone-handling implementation is largely in here: (underlying impl is a datetime64[ns] dtype with a pytz as the timezone) https://github.com/pydata/pandas/blob/master/pandas/tseries/index.py On Fri, Mar 28, 2014 at 4:30 PM, Sankarshan Mudkavi wrote: > > Hi Nathaniel, > > 1- You give as an example of "naive" datetime handling: > > >>> np.datetime64('2005-02-25T03:00Z') > np.datetime64('2005-02-25T03:00') > > This IIUC is incorrect. The Z modifier is a timezone offset, and for > normal "naive" datetimes would cause an error. > > > If what I understand from reading: > http://thread.gmane.org/gmane.comp.python.numeric.general/53805 > > It looks like anything other than Z, 00:00 or UTC that has a TZ adjustment > would raise an error, and those specific conditions would not (I'm guessing > this is because we assume it's UTC (or the same timezone) internally, > anything that explicitly tells us it is UTC is acceptable, although that > may be just my misreading of it.) > > However on output we don't use the Z modifier (which is why it's different > from the UTC datetime64). > > I will change it to return an error if what I thought is incorrect and > also include examples of conversion from datetimes as you requested. > > Please let me know if there are any more changes that are required! I look > forward to further comments/questions. > > Cheers, > Sankarshan > > On Fri, Mar 28, 2014 at 5:17 AM, Nathaniel Smith wrote: > > On 28 Mar 2014 05:00, "Sankarshan Mudkavi" wrote: > > > > Hi all, > > > > Apologies for the delay in following up, here is an expanded version of > the proposal, which hopefully clears up most of the details. I have not > included specific implementation details for the code, such as which > functions to modify etc. since I think those are not traditionally included > in NEPs? > > The format seems fine to me. Really the point is just to have a document > that we can use as reference when deciding on behaviour, and this does that > :-). > > Three quick comments: > > 1- You give as an example of "naive" datetime handling: > > >>> np.datetime64('2005-02-25T03:00Z') > np.datetime64('2005-02-25T03:00') > > This IIUC is incorrect. The Z modifier is a timezone offset, and for > normal "naive" datetimes would cause an error. > > 2- It would be good to include explicitly examples of conversion to and > from datetimes alongside the examples of conversions to and from strings. > > 3- It would be good to (eventually) include some discussion of the impact > of the preferred proposal on existing code. E.g., will this break a lot of > people's pipelines? (Are people currently *always* adding timezones to > their numpy input to avoid the problem, and now will have to switch to the > opposite behaviour depending on numpy version?) And we'll want to make sure > to get feedback from the pydata@ (pandas) list explicitly, though that > can wait until people here have had a chance to respond to the first draft. > > Thanks for pushing this forward! > -n > > Hi all, > > Apologies for the delay in following up, here is an expanded version of > the proposal, which hopefully clears up most of the details. I have not > included specific implementation details for the code, such as which > functions to modify etc. since I think those are not traditionally included > in NEPs? > > Please find attached the expanded proposal, and the rendered version is > available here: > > https://github.com/Sankarshan-Mudkavi/numpy/blob/Enhance-datetime64/doc/neps/datetime-improvement-proposal.rst > > > > I look forward to comments, agreements/disagreements with this (and > clarification if this needs even further expansion). > > > Please find attached the > On Mar 24, 2014, at 12:39 AM, Chris Barker wrote: > > On Fri, Mar 21, 2014 at 3:43 PM, Nathaniel Smith wrote: > >> On Thu, Mar 20, 2014 at 11:27 PM, Chris Barker >> wrote: >> > * I think there are more or less three options: >> > 1) a) don't have any timezone handling at all -- all datetime64s >> are UTC. Always >> > b) don't have any timezone handling at all -- all datetime64s >> are naive >> > (the only difference between these two is I/O of strings, >> and maybe I/O of datetime objects with a time zone) >> > 2) Have a time zone associated with the array -- defaulting to >> either UTC or None, but don't provide any implementation other than the >> tagging, with the ability to add in TZ handler if you want (can this be >> done efficiently?) >> > 3) Full on proper TZ handling. >> > >> > I think (3) is off the table for now. >> >> I think the first goal is to define what a plain vanilla datetime64 >> does, without any extra attributes. This is for two practical reasons: >> First, our overriding #1 goal is to fix the nasty I/O problems that >> default datetime64's show, so until that's done any other bells and >> whistles are a distraction. And second, adding parameters to dtypes >> right now is technically messy. >> >> This rules out (2) and (3). >> > > yup -- though I'm not sure I agree that we need to do this, if we are > going to do something more later anyway. But you have a key point - maybe > the dtype system simply isn't ready to do it right, and then it may be > better not to try. > > In which case, we are down to naive or always UTC -- and again, those > really aren't very different. Though I prefer naive -- always UTC adds some > complication if you don't actually want UTC, and I'm not sure it actually > buys us anything. And maybe it's jsut me, but all my code would need to use > naive, so I"d be doing a bit of working around to use a UTC-always system. > > >> If we additionally want to keep the option of adding a timezone >> parameter later, and have the result end up looking like stdlib >> datetime, then I think 1(b) is the obvious choice. My guess is that >> this is also what's most compatible with pandas, which is currently >> keeping its own timezone object outside of the dtype. >> > > Good point, all else being equal, compatability with Pandas would be a > good thing. > > Any downsides? I guess this would mean that we start raising an error >> on ISO 8601's with offsets attached, which might annoy some people? >> > > yes, but errors are better than incorrect values... > > > Writing this made me think of a third option -- tracking, but no real > manipulation, of TZ. This would be analogous to the ISO 8601 does -- all it > does is note an offset. A given DateTime64 array would have a given offset > assigned to it, and the appropriate addition and subtraction would happen > at I/O. Offset of 0.00 would be UTC, and there would be a None option for > naive. > > Please no! An integer offset is a terrible way to represent timezones, >> > > well, it would solve the being able to read ISO strings problem, and being > able to perform operations with datetimes in multiple time zones. though I > guess you could get most of that with UTC-always. > > >> and hardcoding this would just get in the way of a proper solution. >> > > well, that's a point -- if we think there is any hope of a proper solution > down the road, then yes, it would be better not to make that harder. > > -Chris > > -- > > Christopher Barker, Ph.D. > Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R (206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > Chris.Barker at noaa.gov > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > -- > Sankarshan Mudkavi > Undergraduate in Physics, University of Waterloo > www.smudkavi.com > > > > > > > > -- > Sankarshan Mudkavi > Undergraduate in Physics, University of Waterloo > www.smudkavi.com > > > > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Fri Mar 28 16:59:55 2014 From: matthew.brett at gmail.com (Matthew Brett) Date: Fri, 28 Mar 2014 13:59:55 -0700 Subject: [Numpy-discussion] Default builds of OpenBLAS development branch are now fork safe In-Reply-To: References: <53331DBF.6020504@googlemail.com> <841461679417710892.547329sturla.molden-gmail.com@news.gmane.org> Message-ID: Hi, On Fri, Mar 28, 2014 at 12:57 PM, Robert Kern wrote: > Of course, that's besides the point. Yes, pretty much everyone that likes > the BSD license of numpy will be okay with the minimal burdens the MPL2 lays > on them. The problem is that we need to properly communicate that license. > The PyPI page is not adequate to that task, in my opinion. I have no problem > with the project distributing such binaries anywhere else. But then, I have > no problem with the project distributing MKL binaries elsewhere either. > > On Mar 28, 2014 7:34 PM, "Nathaniel Smith" wrote: >> >> On 28 Mar 2014 20:26, "Robert Kern" wrote: >> > >> > It's only a problem in that the binary will not be BSD, and we do need >> > to communicate that appropriately. It will contain a significant component >> > that is MPL2 licensed. The terms that force us to include the link to the >> > Eigen source that we used forces downstream redistributors of the binary to >> > do the same. Now, of all the copyleft licenses, this is certainly the most >> > friendly, but it is not BSD. >> >> AFAICT, the only way redistributers could violate the MPL would be if they >> unpacked our binary and deleted the license file. But this would also be a >> violation of the BSD. The only difference in terms of requirements on >> redistributors between MPL and BSD seems to be exactly *which* text you >> include in your license file. I don't think even that would violate the MPL. The MPL says only that we - the distributors of binary code from an MPL project - must do this (3.1) "... inform recipients of the Executable Form how they can obtain a copy of such Source Code Form". It doesn't say we have to require the recipient to do the same [1], and it doesn't say that has to be in our license. I don't think it can mean that, because otherwise it would not make sense to say (in section 3.2) that we can "sublicense it under different terms, provided that the license for the Executable Form does not attempt to limit or alter the recipients' rights in the Source Code Form under this License.". The unmodified standard BSD license does not alter the recipients rights to the source code form of Eigen. >> I don't know if Eigen is a good choice on technical grounds (or even a >> possible one - has anyone ever actually compiled numpy against it?), but >> this license thing just doesn't seem like an important issue to me, if the >> alternative is not providing useful binaries. Am I correct in thinking we are all agreeing that it would be OK to distribute binary wheels for numpy from pypi, with compiled Eigen? See you, Matthew [1] "It's important to understand that the condition to distribute files under the MPL's terms only applies to the party that first creates and distributes the Larger Work." "https://www.gnu.org/licenses/license-list.html From matthew.brett at gmail.com Fri Mar 28 17:16:07 2014 From: matthew.brett at gmail.com (Matthew Brett) Date: Fri, 28 Mar 2014 14:16:07 -0700 Subject: [Numpy-discussion] Default builds of OpenBLAS development branch are now fork safe In-Reply-To: <1473349758417731165.921313sturla.molden-gmail.com@news.gmane.org> References: <841461679417710892.547329sturla.molden-gmail.com@news.gmane.org> <1473349758417731165.921313sturla.molden-gmail.com@news.gmane.org> Message-ID: Hi, On Fri, Mar 28, 2014 at 1:28 PM, Sturla Molden wrote: > Nathaniel Smith wrote: > >> If the only problem with eigen turns out to be that we have to add a line >> of text to a file then I think we can probably manage this somehow. > > We would also have to compile Eigen-BLAS for various architectures and CPU > counts. It is not "adaptive" like MKL or OpenBLAS. Yes, I guess we currently have no idea how bad a default Eigen would be. We also have the soft constraint that any choice we make should also work for building scipy binaries - so adequate lapack coverage. I believe that means lapack_lite is not an option? So I guess the options are: * eigen - could it be slow? * openblas - could it be buggy? * reference blas / lapack [1] [2] [3] In [2] someone seems to be getting very good performance from the reference implementation. I guess we need to benchmark these guys on some standard systems, and decide how bad the performance / stability has to be before it's better not to provide binaries at all. Cheers, Matthew [1] http://icl.cs.utk.edu/lapack-for-windows/lapack/ [2] http://ylzhao.blogspot.com/2013/10/blas-lapack-precompiled-binaries-for.html [3] http://www.fi.muni.cz/~xsvobod2/misc/lapack/ From njs at pobox.com Fri Mar 28 17:18:26 2014 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 28 Mar 2014 22:18:26 +0100 Subject: [Numpy-discussion] Default builds of OpenBLAS development branch are now fork safe In-Reply-To: References: <841461679417710892.547329sturla.molden-gmail.com@news.gmane.org> <1473349758417731165.921313sturla.molden-gmail.com@news.gmane.org> Message-ID: I thought OpenBLAS is usually used with reference lapack? On 28 Mar 2014 22:16, "Matthew Brett" wrote: > Hi, > > On Fri, Mar 28, 2014 at 1:28 PM, Sturla Molden > wrote: > > Nathaniel Smith wrote: > > > >> If the only problem with eigen turns out to be that we have to add a > line > >> of text to a file then I think we can probably manage this somehow. > > > > We would also have to compile Eigen-BLAS for various architectures and > CPU > > counts. It is not "adaptive" like MKL or OpenBLAS. > > Yes, I guess we currently have no idea how bad a default Eigen would be. > > We also have the soft constraint that any choice we make should also > work for building scipy binaries - so adequate lapack coverage. > > I believe that means lapack_lite is not an option? > > So I guess the options are: > > * eigen - could it be slow? > * openblas - could it be buggy? > * reference blas / lapack [1] [2] [3] > > In [2] someone seems to be getting very good performance from the > reference implementation. > > I guess we need to benchmark these guys on some standard systems, and > decide how bad the performance / stability has to be before it's > better not to provide binaries at all. > > Cheers, > > Matthew > > [1] http://icl.cs.utk.edu/lapack-for-windows/lapack/ > [2] > http://ylzhao.blogspot.com/2013/10/blas-lapack-precompiled-binaries-for.html > [3] http://www.fi.muni.cz/~xsvobod2/misc/lapack/ > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From olivier.grisel at ensta.org Fri Mar 28 17:38:34 2014 From: olivier.grisel at ensta.org (Olivier Grisel) Date: Fri, 28 Mar 2014 22:38:34 +0100 Subject: [Numpy-discussion] Default builds of OpenBLAS development branch are now fork safe In-Reply-To: References: <841461679417710892.547329sturla.molden-gmail.com@news.gmane.org> <1473349758417731165.921313sturla.molden-gmail.com@news.gmane.org> Message-ID: 2014-03-28 22:18 GMT+01:00 Nathaniel Smith : > I thought OpenBLAS is usually used with reference lapack? I am no longer sure myself. Debian & thus Ubuntu seem to be only packaging the BLAS part of OpenBLAS for the libblas.so symlink and uses the reference implementation of lapack for the liblapack.so symlink. I observed a sparse leastsqr bug when linking scipy against the full OpenBLAS under OSX that I could not reproduce under with the openblas + lapack combo shipped Ubuntu so there might be a difference. But that could also be caused by a version / platform discrepancy between my two setups. -- Olivier From jtaylor.debian at googlemail.com Fri Mar 28 17:55:44 2014 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Fri, 28 Mar 2014 22:55:44 +0100 Subject: [Numpy-discussion] Default builds of OpenBLAS development branch are now fork safe In-Reply-To: References: <841461679417710892.547329sturla.molden-gmail.com@news.gmane.org> <1473349758417731165.921313sturla.molden-gmail.com@news.gmane.org> Message-ID: <5335EFE0.3090504@googlemail.com> On 28.03.2014 22:38, Olivier Grisel wrote: > 2014-03-28 22:18 GMT+01:00 Nathaniel Smith : >> I thought OpenBLAS is usually used with reference lapack? > > I am no longer sure myself. Debian & thus Ubuntu seem to be only > packaging the BLAS part of OpenBLAS for the libblas.so symlink and > uses the reference implementation of lapack for the liblapack.so > symlink. You can link the reference lapack with any library providing a BLAS compatible API/ABI. ATLAS and OpenBlas are ABI compatible with reference BLAS which allows replacing the library via LD_PRELOAD or debian alternatives without recompiling. Both ATLAS and OpenBlas provide a subset of optimized lapack functions, but they are optional. On Debian/Ubuntu you can install ATLAS lapack but then you are also forced to use ATLAS blas. I am not familiar with how relevant the optimized parts of lapack for the general use case. > > I observed a sparse leastsqr bug when linking scipy against the full > OpenBLAS under OSX that I could not reproduce under with the openblas > + lapack combo shipped Ubuntu so there might be a difference. But that > could also be caused by a version / platform discrepancy between my > two setups. > what kind of a bug? wrong result or crash? Which target did openblas use when compiling on macos? or was it a dynamic build? (see the name of the built static library) The adaptive nature of OpenBLAS can make reproducing issues tricky, I don't think there is a way to force it to use a certain kernel at runtime besides recompiling for a different target. If you have a testcase I'd be interested in it, as I'm trying to get openblas into a decent shape for ubuntu 14.04. From olivier.grisel at ensta.org Fri Mar 28 18:05:42 2014 From: olivier.grisel at ensta.org (Olivier Grisel) Date: Fri, 28 Mar 2014 23:05:42 +0100 Subject: [Numpy-discussion] Default builds of OpenBLAS development branch are now fork safe In-Reply-To: <5335EFE0.3090504@googlemail.com> References: <841461679417710892.547329sturla.molden-gmail.com@news.gmane.org> <1473349758417731165.921313sturla.molden-gmail.com@news.gmane.org> <5335EFE0.3090504@googlemail.com> Message-ID: 2014-03-28 22:55 GMT+01:00 Julian Taylor : > On 28.03.2014 22:38, Olivier Grisel wrote: >> 2014-03-28 22:18 GMT+01:00 Nathaniel Smith : >>> I thought OpenBLAS is usually used with reference lapack? >> >> I am no longer sure myself. Debian & thus Ubuntu seem to be only >> packaging the BLAS part of OpenBLAS for the libblas.so symlink and >> uses the reference implementation of lapack for the liblapack.so >> symlink. > > You can link the reference lapack with any library providing a BLAS > compatible API/ABI. ATLAS and OpenBlas are ABI compatible with reference > BLAS which allows replacing the library via LD_PRELOAD or debian > alternatives without recompiling. > > Both ATLAS and OpenBlas provide a subset of optimized lapack functions, > but they are optional. On Debian/Ubuntu you can install ATLAS lapack but > then you are also forced to use ATLAS blas. > > I am not familiar with how relevant the optimized parts of lapack for > the general use case. > >> >> I observed a sparse leastsqr bug when linking scipy against the full >> OpenBLAS under OSX that I could not reproduce under with the openblas >> + lapack combo shipped Ubuntu so there might be a difference. But that >> could also be caused by a version / platform discrepancy between my >> two setups. >> > > what kind of a bug? wrong result or crash? Here it is: https://github.com/scikit-learn/scikit-learn/issues/2986 I have not found the time to investigate yet. > Which target did openblas use > when compiling on macos? or was it a dynamic build? (see the name of the > built static library) I think I used target=NEHALEM that time (but not 100% sure). -- Olivier http://twitter.com/ogrisel - http://github.com/ogrisel From olivier.grisel at ensta.org Fri Mar 28 18:09:31 2014 From: olivier.grisel at ensta.org (Olivier Grisel) Date: Fri, 28 Mar 2014 23:09:31 +0100 Subject: [Numpy-discussion] ANN: NumPy 1.8.1 release In-Reply-To: References: <5332135C.7040903@googlemail.com> Message-ID: This is great! Has anyone started to work on OSX whl packages for scipy? I assume the libgfortran, libquadmath & libgcc_s dylibs will not make it as easy as for numpy. Would it be possible to use a static gcc toolchain as Carl Kleffner is using for his experimental windows whl packages? -- Olivier From matthew.brett at gmail.com Fri Mar 28 18:13:59 2014 From: matthew.brett at gmail.com (Matthew Brett) Date: Fri, 28 Mar 2014 15:13:59 -0700 Subject: [Numpy-discussion] ANN: NumPy 1.8.1 release In-Reply-To: References: <5332135C.7040903@googlemail.com> Message-ID: Hi, On Fri, Mar 28, 2014 at 3:09 PM, Olivier Grisel wrote: > This is great! Has anyone started to work on OSX whl packages for > scipy? I assume the libgfortran, libquadmath & libgcc_s dylibs will > not make it as easy as for numpy. Would it be possible to use a static > gcc toolchain as Carl Kleffner is using for his experimental windows > whl packages? Yes, these are already done for the beta release, and for matplotlib: https://nipy.bic.berkeley.edu/scipy_installers/ Luckily OSX has a sensible way of setting relative paths to required libraries, so it's pretty easy to copy the required dlls into the binary distribution: https://github.com/matthew-brett/delocate Cheers, Matthew From jtaylor.debian at googlemail.com Fri Mar 28 18:14:30 2014 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Fri, 28 Mar 2014 23:14:30 +0100 Subject: [Numpy-discussion] ANN: NumPy 1.8.1 release In-Reply-To: References: <5332135C.7040903@googlemail.com> Message-ID: <5335F446.7070406@googlemail.com> On 28.03.2014 23:09, Olivier Grisel wrote: > This is great! Has anyone started to work on OSX whl packages for > scipy? I assume the libgfortran, libquadmath & libgcc_s dylibs will > not make it as easy as for numpy. Would it be possible to use a static > gcc toolchain as Carl Kleffner is using for his experimental windows > whl packages? > you can get rid of libgfortran and quadmath with the -static-libgfortran flag libgcc_s is probably more tricky as scipy uses c++ so -static-libgcc may need checking before using it doesn't mac provide libgcc_s anyway? Even though they have clang by default now, I doubt they can remove libgcc very soon. From sturla.molden at gmail.com Fri Mar 28 18:32:23 2014 From: sturla.molden at gmail.com (Sturla Molden) Date: Fri, 28 Mar 2014 22:32:23 +0000 (UTC) Subject: [Numpy-discussion] Default builds of OpenBLAS development branch are now fork safe References: <1473349758417731165.921313sturla.molden-gmail.com@news.gmane.org> Message-ID: <730748148417738710.766397sturla.molden-gmail.com@news.gmane.org> Nathaniel Smith wrote: > I thought OpenBLAS is usually used with reference lapack? It is. From sturla.molden at gmail.com Fri Mar 28 19:02:36 2014 From: sturla.molden at gmail.com (Sturla Molden) Date: Fri, 28 Mar 2014 23:02:36 +0000 (UTC) Subject: [Numpy-discussion] NumPy 1.8.1 release References: <5332135C.7040903@googlemail.com> <5335F446.7070406@googlemail.com> Message-ID: <347365958417740469.344443sturla.molden-gmail.com@news.gmane.org> Julian Taylor wrote: > On 28.03.2014 23:09, Olivier Grisel wrote: > you can get rid of libgfortran and quadmath with the -static-libgfortran > flag > libgcc_s is probably more tricky as scipy uses c++ so -static-libgcc may > need checking before using it > doesn't mac provide libgcc_s anyway? Even though they have clang by > default now, I doubt they can remove libgcc very soon. As of OS X 10.9 (Mavericks), -static-libgcc is not supported by the C compiler. Sturla From jtaylor.debian at googlemail.com Fri Mar 28 19:05:28 2014 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Sat, 29 Mar 2014 00:05:28 +0100 Subject: [Numpy-discussion] NumPy 1.8.1 release In-Reply-To: <347365958417740469.344443sturla.molden-gmail.com@news.gmane.org> References: <5332135C.7040903@googlemail.com> <5335F446.7070406@googlemail.com> <347365958417740469.344443sturla.molden-gmail.com@news.gmane.org> Message-ID: <53360038.6050106@googlemail.com> On 29.03.2014 00:02, Sturla Molden wrote: > Julian Taylor wrote: >> On 28.03.2014 23:09, Olivier Grisel wrote: > >> you can get rid of libgfortran and quadmath with the -static-libgfortran >> flag >> libgcc_s is probably more tricky as scipy uses c++ so -static-libgcc may >> need checking before using it >> doesn't mac provide libgcc_s anyway? Even though they have clang by >> default now, I doubt they can remove libgcc very soon. > > As of OS X 10.9 (Mavericks), -static-libgcc is not supported by the C > compiler. > because the C compiler is not gcc, so obviously also no libgcc. 10.9 uses clang by default. But the library is still installed in the system (at least on the 10.9 macs I saw) From sturla.molden at gmail.com Fri Mar 28 19:17:29 2014 From: sturla.molden at gmail.com (Sturla Molden) Date: Fri, 28 Mar 2014 23:17:29 +0000 (UTC) Subject: [Numpy-discussion] NumPy 1.8.1 release References: <5332135C.7040903@googlemail.com> Message-ID: <622564271417741082.693270sturla.molden-gmail.com@news.gmane.org> Olivier Grisel wrote: > Would it be possible to use a static > gcc toolchain as Carl Kleffner is using for his experimental windows > whl packages? I think we should consider to device a Fortran to C99 translator. Sturla From sturla.molden at gmail.com Fri Mar 28 20:09:35 2014 From: sturla.molden at gmail.com (Sturla Molden) Date: Sat, 29 Mar 2014 01:09:35 +0100 Subject: [Numpy-discussion] NumPy 1.8.1 release In-Reply-To: <53360038.6050106@googlemail.com> References: <5332135C.7040903@googlemail.com> <5335F446.7070406@googlemail.com> <347365958417740469.344443sturla.molden-gmail.com@news.gmane.org> <53360038.6050106@googlemail.com> Message-ID: On 29/03/14 00:05, Julian Taylor wrote: > But the library is still installed in the system (at least on the 10.9 > macs I saw) > I only find it in the gfortran 4.8 I installed separately. Nowhere else. Sturla From matthew.brett at gmail.com Fri Mar 28 20:17:06 2014 From: matthew.brett at gmail.com (Matthew Brett) Date: Fri, 28 Mar 2014 17:17:06 -0700 Subject: [Numpy-discussion] NumPy 1.8.1 release In-Reply-To: References: <5332135C.7040903@googlemail.com> <5335F446.7070406@googlemail.com> <347365958417740469.344443sturla.molden-gmail.com@news.gmane.org> <53360038.6050106@googlemail.com> Message-ID: On Fri, Mar 28, 2014 at 5:09 PM, Sturla Molden wrote: > On 29/03/14 00:05, Julian Taylor wrote: > >> But the library is still installed in the system (at least on the 10.9 >> macs I saw) >> > > I only find it in the gfortran 4.8 I installed separately. Nowhere else. Have a look at the README for delocate: https://github.com/matthew-brett/delocate The worked example is scipy; you can see it copying these libs into the binary wheel: /usr/local/Cellar/gfortran/4.8.2/gfortran/lib/libgcc_s.1.dylib /usr/local/Cellar/gfortran/4.8.2/gfortran/lib/libgfortran.3.dylib /usr/local/Cellar/gfortran/4.8.2/gfortran/lib/libquadmath.0.dylib in this case from a homebrew installation. The resulting wheel is here: https://nipy.bic.berkeley.edu/scipy_installers/numpy-1.8.0-cp27-none-macosx_10_6_intel.whl If you do: pip install --upgrade pip pip install --pre --find-links https://nipy.bic.berkeley.edu/scipy_installers scipy you should find that you get a scipy version that passes its tests, even if you rename your libgcc file. Cheers, Matthew From jaime.frio at gmail.com Sat Mar 29 00:12:22 2014 From: jaime.frio at gmail.com (=?ISO-8859-1?Q?Jaime_Fern=E1ndez_del_R=EDo?=) Date: Fri, 28 Mar 2014 21:12:22 -0700 Subject: [Numpy-discussion] Changes to np.vander Message-ID: Hi, I have submitted a PR (https://github.com/numpy/numpy/pull/4568) that speeds up `np.vander` by using accumulated multiplication instead of exponentiation to compute the Vandermonde matrix. For largish matrices the speed-ups can be quite dramatic, over an order of magnitude. Julian has raised concerns on numerical stability and loss of precision, which don't seem to be all that relevant. Do speak up if you think otherwise. We are also discussing replacing a recently added kwarg, "order", which now accepts a string, either "increasing" or "decreasing", to indicate the ordering of the matrix columns. This was not present in 1.8, so can still be modified. The proposal is to replace it with a "reversed" boolean flag. Unfortunately, the return of np.vander in 1.8 and before is the opposite (i.e. its reversed) from the standard definition, which has powers increasing from left to right. So it is not clear what the reversed keyword should refer to: 1. If it refers to the standard definition, then it would default to False for backwards compatibility, but be consistent with the conventional definition. 2. If it refers to the existing behavior of numpy's vander, then it would default to True, and not be consistent with the conventional definition. I prefer option 1, but would like to hear other's opinions. Which could of course include naming the boolean flag more ingeniously, or keeping the string flag. If he's reading, I'd specially like to hear Warren Weckesser's thoughts, as he is the one who added the "order" kwarg. Jaime -- (\__/) ( O.o) ( > <) Este es Conejo. Copia a Conejo en tu firma y ay?dale en sus planes de dominaci?n mundial. -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Sat Mar 29 07:31:28 2014 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 29 Mar 2014 07:31:28 -0400 Subject: [Numpy-discussion] Changes to np.vander In-Reply-To: References: Message-ID: On Sat, Mar 29, 2014 at 12:12 AM, Jaime Fern?ndez del R?o wrote: > Hi, > > I have submitted a PR (https://github.com/numpy/numpy/pull/4568) that speeds > up `np.vander` by using accumulated multiplication instead of exponentiation > to compute the Vandermonde matrix. For largish matrices the speed-ups can be > quite dramatic, over an order of magnitude. > > Julian has raised concerns on numerical stability and loss of precision, > which don't seem to be all that relevant. Do speak up if you think > otherwise. > > We are also discussing replacing a recently added kwarg, "order", which now > accepts a string, either "increasing" or "decreasing", to indicate the > ordering of the matrix columns. This was not present in 1.8, so can still be > modified. The proposal is to replace it with a "reversed" boolean flag. > Unfortunately, the return of np.vander in 1.8 and before is the opposite > (i.e. its reversed) from the standard definition, which has powers > increasing from left to right. So it is not clear what the reversed keyword > should refer to: > > 1. If it refers to the standard definition, then it would default to False > for backwards compatibility, but be consistent with the conventional > definition. > > 2. If it refers to the existing behavior of numpy's vander, then it would > default to True, and not be consistent with the conventional definition. > > I prefer option 1, but would like to hear other's opinions. Which could of > course include naming the boolean flag more ingeniously, or keeping the > string flag. If he's reading, I'd specially like to hear Warren Weckesser's > thoughts, as he is the one who added the "order" kwarg. "order" is not a good name, I would find it very confusing (I'm usually mixing up order and degree) http://en.wikipedia.org/wiki/Order_of_a_polynomial how about calling the keyword "increasing=False" ? which would avoid defining what it's reversed against. I don't know about precision loss. There's a nasty NIST problem for polynomial regression. If that doesn't change much, I wouldn't worry about differences in precision for statistical applications. But the problem is for the regression part, and might not be affected much by the vander precision. (Besides it's an example for "Don't do that at home.") http://jpktd.blogspot.ca/2012/03/numerical-accuracy-in-linear-least.html http://en.wikibooks.org/wiki/Statistics:Numerical_Methods/Numerical_Comparison_of_Statistical_Software#Linear_Regression Josef > > Jaime > > -- > (\__/) > ( O.o) > ( > <) Este es Conejo. Copia a Conejo en tu firma y ay?dale en sus planes de > dominaci?n mundial. > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From josef.pktd at gmail.com Sat Mar 29 11:55:06 2014 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 29 Mar 2014 11:55:06 -0400 Subject: [Numpy-discussion] Changes to np.vander In-Reply-To: References: Message-ID: On Sat, Mar 29, 2014 at 7:31 AM, wrote: > On Sat, Mar 29, 2014 at 12:12 AM, Jaime Fern?ndez del R?o > wrote: >> Hi, >> >> I have submitted a PR (https://github.com/numpy/numpy/pull/4568) that speeds >> up `np.vander` by using accumulated multiplication instead of exponentiation >> to compute the Vandermonde matrix. For largish matrices the speed-ups can be >> quite dramatic, over an order of magnitude. >> >> Julian has raised concerns on numerical stability and loss of precision, >> which don't seem to be all that relevant. Do speak up if you think >> otherwise. >> >> We are also discussing replacing a recently added kwarg, "order", which now >> accepts a string, either "increasing" or "decreasing", to indicate the >> ordering of the matrix columns. This was not present in 1.8, so can still be >> modified. The proposal is to replace it with a "reversed" boolean flag. >> Unfortunately, the return of np.vander in 1.8 and before is the opposite >> (i.e. its reversed) from the standard definition, which has powers >> increasing from left to right. So it is not clear what the reversed keyword >> should refer to: >> >> 1. If it refers to the standard definition, then it would default to False >> for backwards compatibility, but be consistent with the conventional >> definition. >> >> 2. If it refers to the existing behavior of numpy's vander, then it would >> default to True, and not be consistent with the conventional definition. >> >> I prefer option 1, but would like to hear other's opinions. Which could of >> course include naming the boolean flag more ingeniously, or keeping the >> string flag. If he's reading, I'd specially like to hear Warren Weckesser's >> thoughts, as he is the one who added the "order" kwarg. > > "order" is not a good name, I would find it very confusing (I'm > usually mixing up order and degree) > http://en.wikipedia.org/wiki/Order_of_a_polynomial > > how about calling the keyword "increasing=False" ? > which would avoid defining what it's reversed against. Obviously I didn't read the PR before answering. But it shows that `increasing` might be obvious, or that Warren and I think the same way. Josef > > > I don't know about precision loss. > There's a nasty NIST problem for polynomial regression. If that > doesn't change much, I wouldn't worry about differences in precision > for statistical applications. > > But the problem is for the regression part, and might not be affected > much by the vander precision. (Besides it's an example for "Don't do > that at home.") > http://jpktd.blogspot.ca/2012/03/numerical-accuracy-in-linear-least.html > http://en.wikibooks.org/wiki/Statistics:Numerical_Methods/Numerical_Comparison_of_Statistical_Software#Linear_Regression > > Josef > >> >> Jaime >> >> -- >> (\__/) >> ( O.o) >> ( > <) Este es Conejo. Copia a Conejo en tu firma y ay?dale en sus planes de >> dominaci?n mundial. >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> From njs at pobox.com Sat Mar 29 16:04:03 2014 From: njs at pobox.com (Nathaniel Smith) Date: Sat, 29 Mar 2014 21:04:03 +0100 Subject: [Numpy-discussion] Dates and times and Datetime64 (again) In-Reply-To: <614DCCFD-BFFC-496D-B721-A08F68FFD6D2@uwaterloo.ca> References: <79374FB2-205D-4B76-ADB2-F9895D3A2DF4@uwaterloo.ca> <614DCCFD-BFFC-496D-B721-A08F68FFD6D2@uwaterloo.ca> Message-ID: On Fri, Mar 28, 2014 at 9:30 PM, Sankarshan Mudkavi wrote: > > Hi Nathaniel, > > 1- You give as an example of "naive" datetime handling: > >>>> np.datetime64('2005-02-25T03:00Z') > np.datetime64('2005-02-25T03:00') > > This IIUC is incorrect. The Z modifier is a timezone offset, and for normal > "naive" datetimes would cause an error. > > > If what I understand from reading: > http://thread.gmane.org/gmane.comp.python.numeric.general/53805 > > It looks like anything other than Z, 00:00 or UTC that has a TZ adjustment > would raise an error, and those specific conditions would not (I'm guessing > this is because we assume it's UTC (or the same timezone) internally, > anything that explicitly tells us it is UTC is acceptable, although that may > be just my misreading of it.) If we assume it's UTC, then that's proposal 2, I think :-). My point is just that "naive datetime" already has a specific meaning in Python, and as I understand that meaning, it says that trying to pass a Z timezone to a naive datetime should be an error. As a separate issue, we might decide that we want to continue to allow "Z" modifiers (or all offset modifiers) temporarily in numpy, to avoid breaking code without warning. Just if we do, then we shoudn't say that this is because we are implementing naive datetimes and this is how naive datetimes work. Instead we should either say that we're not implementing naive datetimes, or else say that we're implementing naive datetimes but have some temporary compatibility hacks on top of that (and probably issue a DeprecationWarning if anyone passes a timezone). -- Nathaniel J. Smith Postdoctoral researcher - Informatics - University of Edinburgh http://vorpus.org From chris.barker at noaa.gov Sat Mar 29 16:56:14 2014 From: chris.barker at noaa.gov (Chris Barker) Date: Sat, 29 Mar 2014 13:56:14 -0700 Subject: [Numpy-discussion] Dates and times and Datetime64 (again) In-Reply-To: References: <79374FB2-205D-4B76-ADB2-F9895D3A2DF4@uwaterloo.ca> <614DCCFD-BFFC-496D-B721-A08F68FFD6D2@uwaterloo.ca> Message-ID: On Sat, Mar 29, 2014 at 1:04 PM, Nathaniel Smith wrote: > > 1- You give as an example of "naive" datetime handling: > > > >>>> np.datetime64('2005-02-25T03:00Z') > > np.datetime64('2005-02-25T03:00') > > > > This IIUC is incorrect. The Z modifier is a timezone offset, and for > normal > > "naive" datetimes would cause an error. > I think this is somewhat open for discussion -- yes, it's odd, but in the spirit of practicality beats purity, it seems OK. We could allow any TZ specifier for that matter -- that's kind of how "naive" or "local" timezone (non) handling works -- it's up to the user to make sure that all DTs are in the same timezone. All it would be doing is tossing out some additional information that was in the ISO string. If we are explicitly calling it UTC-always, then anything other than Z or 00:00 (or nothing) would need to be converted. I think when it comes down to it, anything other than "proper" timezone handling will require these user-beware compromises. As a separate issue, we might decide that we want to continue to allow > "Z" modifiers (or all offset modifiers) temporarily in numpy, to avoid > breaking code without warning. Maybe the best tactic -- though it's broken enough now that I'm not sure it matters. A clear direction from here may be a better bet. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Sat Mar 29 18:08:48 2014 From: njs at pobox.com (Nathaniel Smith) Date: Sat, 29 Mar 2014 23:08:48 +0100 Subject: [Numpy-discussion] Dates and times and Datetime64 (again) In-Reply-To: References: <79374FB2-205D-4B76-ADB2-F9895D3A2DF4@uwaterloo.ca> <614DCCFD-BFFC-496D-B721-A08F68FFD6D2@uwaterloo.ca> Message-ID: On 29 Mar 2014 20:57, "Chris Barker" wrote: > I think this is somewhat open for discussion -- yes, it's odd, but in the spirit of practicality beats purity, it seems OK. We could allow any TZ specifier for that matter -- that's kind of how "naive" or "local" timezone (non) handling works -- it's up to the user to make sure that all DTs are in the same timezone. That isn't how naive timezone handling works in datetime.datetime, though. If you try to mix a timezone (even a Zulu timezone) datetime with a naive datetime, you get an exception. I agree this is open for discussion, but IMO deviating from the stdlib behavior this much would require some more justification. Don't let errors pass silently, etc. -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From jaime.frio at gmail.com Sat Mar 29 21:07:39 2014 From: jaime.frio at gmail.com (=?ISO-8859-1?Q?Jaime_Fern=E1ndez_del_R=EDo?=) Date: Sat, 29 Mar 2014 18:07:39 -0700 Subject: [Numpy-discussion] Changes to np.vander In-Reply-To: References: Message-ID: On Sat, Mar 29, 2014 at 8:55 AM, wrote: > On Sat, Mar 29, 2014 at 7:31 AM, wrote: > > On Sat, Mar 29, 2014 at 12:12 AM, Jaime Fern?ndez del R?o > > wrote: > >> Hi, > >> > >> I have submitted a PR (https://github.com/numpy/numpy/pull/4568) that > speeds > >> up `np.vander` by using accumulated multiplication instead of > exponentiation > >> to compute the Vandermonde matrix. For largish matrices the speed-ups > can be > >> quite dramatic, over an order of magnitude. > >> > >> Julian has raised concerns on numerical stability and loss of precision, > >> which don't seem to be all that relevant. Do speak up if you think > >> otherwise. > >> > >> We are also discussing replacing a recently added kwarg, "order", which > now > >> accepts a string, either "increasing" or "decreasing", to indicate the > >> ordering of the matrix columns. This was not present in 1.8, so can > still be > >> modified. The proposal is to replace it with a "reversed" boolean flag. > >> Unfortunately, the return of np.vander in 1.8 and before is the opposite > >> (i.e. its reversed) from the standard definition, which has powers > >> increasing from left to right. So it is not clear what the reversed > keyword > >> should refer to: > >> > >> 1. If it refers to the standard definition, then it would default to > False > >> for backwards compatibility, but be consistent with the conventional > >> definition. > >> > >> 2. If it refers to the existing behavior of numpy's vander, then it > would > >> default to True, and not be consistent with the conventional definition. > >> > >> I prefer option 1, but would like to hear other's opinions. Which could > of > >> course include naming the boolean flag more ingeniously, or keeping the > >> string flag. If he's reading, I'd specially like to hear Warren > Weckesser's > >> thoughts, as he is the one who added the "order" kwarg. > > > > "order" is not a good name, I would find it very confusing (I'm > > usually mixing up order and degree) > > http://en.wikipedia.org/wiki/Order_of_a_polynomial > > > > how about calling the keyword "increasing=False" ? > > which would avoid defining what it's reversed against. > > Obviously I didn't read the PR before answering. > > But it shows that `increasing` might be obvious, or that Warren and I > think the same way. > Great minds think alike! It seems we have a 4 people consensus, "increasing" it is. Thanks for the feedback. Jaime -- (\__/) ( O.o) ( > <) Este es Conejo. Copia a Conejo en tu firma y ay?dale en sus planes de dominaci?n mundial. -------------- next part -------------- An HTML attachment was scrubbed... URL: From glenn.caltech at gmail.com Sat Mar 29 21:13:37 2014 From: glenn.caltech at gmail.com (G Jones) Date: Sat, 29 Mar 2014 21:13:37 -0400 Subject: [Numpy-discussion] Transparently reading complex arrays from netcdf4 Message-ID: Hi, I am using netCDF4 to store complex data using the recommended strategy of creating a compound data type with the real and imaginary parts. This all works well, but reading the data into a numpy array is a bit clumsy. Typically I do: nc = netCDF4.Dataset('my.nc') cplx_data = nc.groups['mygroup'].variables['cplx_stuff'][:].view('complex') which directly gives a nice complex numpy array. This is OK for small arrays, but is wasteful if I only need some chunks of the array because it reads all the data in, reducing the utility of the mmap feature of netCDF. I'm wondering if there is a better way to directly make a numpy array view that uses the netcdf variable's memory mapped buffer directly. Looking at the Variable class, there is no access to this buffer directly which could then be passed to np.ndarray(buffer=...). Any ideas of simple solutions to this problem? Thanks, Glenn -------------- next part -------------- An HTML attachment was scrubbed... URL: From shoyer at gmail.com Sat Mar 29 22:59:48 2014 From: shoyer at gmail.com (Stephan Hoyer) Date: Sat, 29 Mar 2014 19:59:48 -0700 Subject: [Numpy-discussion] Transparently reading complex arrays from netcdf4 In-Reply-To: References: Message-ID: Hi Glenn, My usual strategy for this sort of thing is to make a light-weight wrapper class which reads and converts values when you access them. For example: class WrapComplex(object): def __init__(self, nc_var): self.nc_var = nc_var def __getitem__(self, item): return self.nc_var[item].view('complex') nc = netCDF4.Dataset('my.nc') cplx_data = WrapComplex(nc.groups['mygroup'].variables['cplx_stuff']) Now you can index cplx_data (e.g., cplx_data[:10]) and only the values you need will be read from disk and converted on the fly. Hope this helps! Cheers, Stephan On Sat, Mar 29, 2014 at 6:13 PM, G Jones wrote: > Hi, > I am using netCDF4 to store complex data using the recommended strategy of > creating a compound data type with the real and imaginary parts. This all > works well, but reading the data into a numpy array is a bit clumsy. > > Typically I do: > > nc = netCDF4.Dataset('my.nc') > cplx_data = nc.groups['mygroup'].variables['cplx_stuff'][:].view('complex') > > which directly gives a nice complex numpy array. This is OK for small > arrays, but is wasteful if I only need some chunks of the array because it > reads all the data in, reducing the utility of the mmap feature of netCDF. > > I'm wondering if there is a better way to directly make a numpy array view > that uses the netcdf variable's memory mapped buffer directly. Looking at > the Variable class, there is no access to this buffer directly which could > then be passed to np.ndarray(buffer=...). > > Any ideas of simple solutions to this problem? > > Thanks, > Glenn > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From glenn.caltech at gmail.com Sat Mar 29 23:42:43 2014 From: glenn.caltech at gmail.com (G Jones) Date: Sat, 29 Mar 2014 23:42:43 -0400 Subject: [Numpy-discussion] Transparently reading complex arrays from netcdf4 In-Reply-To: References: Message-ID: Hi Stephan, Thanks for the reply. I was thinking of something along these lines but was hesitant because while this provides clean access to chunks of the data, you still have to remember to do cplx_data[:].mean() for example in the case that you want cplx_data.mean(). I was hoping to basically have all of the ndarray methods at hand without any indexing, but then also being smart about taking advantage of the mmap when possible. But perhaps your solution is the best compromise. Thanks again, Glenn On Mar 29, 2014 10:59 PM, "Stephan Hoyer" wrote: > Hi Glenn, > > My usual strategy for this sort of thing is to make a light-weight wrapper > class which reads and converts values when you access them. For example: > > class WrapComplex(object): > def __init__(self, nc_var): > self.nc_var = nc_var > > def __getitem__(self, item): > return self.nc_var[item].view('complex') > > nc = netCDF4.Dataset('my.nc') > cplx_data = WrapComplex(nc.groups['mygroup'].variables['cplx_stuff']) > > Now you can index cplx_data (e.g., cplx_data[:10]) and only the values you > need will be read from disk and converted on the fly. > > Hope this helps! > > Cheers, > Stephan > > > > > On Sat, Mar 29, 2014 at 6:13 PM, G Jones wrote: > >> Hi, >> I am using netCDF4 to store complex data using the recommended strategy >> of creating a compound data type with the real and imaginary parts. This >> all works well, but reading the data into a numpy array is a bit clumsy. >> >> Typically I do: >> >> nc = netCDF4.Dataset('my.nc') >> cplx_data = >> nc.groups['mygroup'].variables['cplx_stuff'][:].view('complex') >> >> which directly gives a nice complex numpy array. This is OK for small >> arrays, but is wasteful if I only need some chunks of the array because it >> reads all the data in, reducing the utility of the mmap feature of netCDF. >> >> I'm wondering if there is a better way to directly make a numpy array >> view that uses the netcdf variable's memory mapped buffer directly. Looking >> at the Variable class, there is no access to this buffer directly which >> could then be passed to np.ndarray(buffer=...). >> >> Any ideas of simple solutions to this problem? >> >> Thanks, >> Glenn >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From shoyer at gmail.com Sun Mar 30 02:18:23 2014 From: shoyer at gmail.com (Stephan Hoyer) Date: Sat, 29 Mar 2014 23:18:23 -0700 Subject: [Numpy-discussion] Transparently reading complex arrays from netcdf4 In-Reply-To: References: Message-ID: Hi Glenn, Here is a full example of how we wrap a netCDF4.Variable object, implementing all of its ndarray-like methods: https://github.com/akleeman/xray/blob/0c1a963be0542b7303dc875278f3b163a15429c5/src/xray/conventions.py#L91 The __array__ method would be the most relevant one for you: it means that numpy knows how to convert the wrapper array into a numpy.ndarray when you call np.mean(cplx_data). More generally, any function that calls np.asarray(cplx_data) will properly convert the values, which should include most functions from well-written libraries (including numpy and scipy). netCDF4.Variable doesn't currently have such an __array__ method, but it will in the next released version of the library. The quick and dirty hack to make all numpy methods work (now going beyond what the netCDF4 library implements) would be to add something like the following: def __getattr__(self, attr): return getattr(np.asarray(self), attr) But this is a little dangerous, since some methods might silently fail or give unpredictable results (e.g., those that modify data). It would be safer to list the methods you want to implement explicitly, or to just liberally use np.asarray. The later is generally a good practice when writing library code, anyways, to catch unusual ndarray subclasses like np.matrix. Stephan On Sat, Mar 29, 2014 at 8:42 PM, G Jones wrote: > Hi Stephan, > Thanks for the reply. I was thinking of something along these lines but > was hesitant because while this provides clean access to chunks of the > data, you still have to remember to do cplx_data[:].mean() for example in > the case that you want cplx_data.mean(). > > I was hoping to basically have all of the ndarray methods at hand without > any indexing, but then also being smart about taking advantage of the mmap > when possible. But perhaps your solution is the best compromise. > > Thanks again, > Glenn > On Mar 29, 2014 10:59 PM, "Stephan Hoyer" wrote: > >> Hi Glenn, >> >> My usual strategy for this sort of thing is to make a light-weight >> wrapper class which reads and converts values when you access them. For >> example: >> >> class WrapComplex(object): >> def __init__(self, nc_var): >> self.nc_var = nc_var >> >> def __getitem__(self, item): >> return self.nc_var[item].view('complex') >> >> nc = netCDF4.Dataset('my.nc') >> cplx_data = WrapComplex(nc.groups['mygroup'].variables['cplx_stuff']) >> >> Now you can index cplx_data (e.g., cplx_data[:10]) and only the values >> you need will be read from disk and converted on the fly. >> >> Hope this helps! >> >> Cheers, >> Stephan >> >> >> >> >> On Sat, Mar 29, 2014 at 6:13 PM, G Jones wrote: >> >>> Hi, >>> I am using netCDF4 to store complex data using the recommended strategy >>> of creating a compound data type with the real and imaginary parts. This >>> all works well, but reading the data into a numpy array is a bit clumsy. >>> >>> Typically I do: >>> >>> nc = netCDF4.Dataset('my.nc') >>> cplx_data = >>> nc.groups['mygroup'].variables['cplx_stuff'][:].view('complex') >>> >>> which directly gives a nice complex numpy array. This is OK for small >>> arrays, but is wasteful if I only need some chunks of the array because it >>> reads all the data in, reducing the utility of the mmap feature of netCDF. >>> >>> I'm wondering if there is a better way to directly make a numpy array >>> view that uses the netcdf variable's memory mapped buffer directly. Looking >>> at the Variable class, there is no access to this buffer directly which >>> could then be passed to np.ndarray(buffer=...). >>> >>> Any ideas of simple solutions to this problem? >>> >>> Thanks, >>> Glenn >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >>> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From glenn.caltech at gmail.com Sun Mar 30 08:18:56 2014 From: glenn.caltech at gmail.com (G Jones) Date: Sun, 30 Mar 2014 08:18:56 -0400 Subject: [Numpy-discussion] Transparently reading complex arrays from netcdf4 In-Reply-To: References: Message-ID: Hi, This looks useful. What you said about __array__ makes sense, but I didn't see it in the code you linked. Do you know when python netcdf4 will support the numpy array interface directly? I searched around for a roadmap but didn't find anything. It may be best for me to proceed with a slightly clumsy interface for now and wait until the array interface is built in for free. Thanks, Glenn On Mar 30, 2014 2:18 AM, "Stephan Hoyer" wrote: > Hi Glenn, > > Here is a full example of how we wrap a netCDF4.Variable object, > implementing all of its ndarray-like methods: > > https://github.com/akleeman/xray/blob/0c1a963be0542b7303dc875278f3b163a15429c5/src/xray/conventions.py#L91 > > The __array__ method would be the most relevant one for you: it means that > numpy knows how to convert the wrapper array into a numpy.ndarray when you > call np.mean(cplx_data). More generally, any function that calls > np.asarray(cplx_data) will properly convert the values, which should > include most functions from well-written libraries (including numpy and > scipy). netCDF4.Variable doesn't currently have such an __array__ method, > but it will in the next released version of the library. > > The quick and dirty hack to make all numpy methods work (now going beyond > what the netCDF4 library implements) would be to add something like the > following: > > def __getattr__(self, attr): > return getattr(np.asarray(self), attr) > > But this is a little dangerous, since some methods might silently fail or > give unpredictable results (e.g., those that modify data). It would be > safer to list the methods you want to implement explicitly, or to just > liberally use np.asarray. The later is generally a good practice when > writing library code, anyways, to catch unusual ndarray subclasses like > np.matrix. > > Stephan > > > On Sat, Mar 29, 2014 at 8:42 PM, G Jones wrote: > >> Hi Stephan, >> Thanks for the reply. I was thinking of something along these lines but >> was hesitant because while this provides clean access to chunks of the >> data, you still have to remember to do cplx_data[:].mean() for example in >> the case that you want cplx_data.mean(). >> >> I was hoping to basically have all of the ndarray methods at hand without >> any indexing, but then also being smart about taking advantage of the mmap >> when possible. But perhaps your solution is the best compromise. >> >> Thanks again, >> Glenn >> On Mar 29, 2014 10:59 PM, "Stephan Hoyer" wrote: >> >>> Hi Glenn, >>> >>> My usual strategy for this sort of thing is to make a light-weight >>> wrapper class which reads and converts values when you access them. For >>> example: >>> >>> class WrapComplex(object): >>> def __init__(self, nc_var): >>> self.nc_var = nc_var >>> >>> def __getitem__(self, item): >>> return self.nc_var[item].view('complex') >>> >>> nc = netCDF4.Dataset('my.nc') >>> cplx_data = WrapComplex(nc.groups['mygroup'].variables['cplx_stuff']) >>> >>> Now you can index cplx_data (e.g., cplx_data[:10]) and only the values >>> you need will be read from disk and converted on the fly. >>> >>> Hope this helps! >>> >>> Cheers, >>> Stephan >>> >>> >>> >>> >>> On Sat, Mar 29, 2014 at 6:13 PM, G Jones wrote: >>> >>>> Hi, >>>> I am using netCDF4 to store complex data using the recommended strategy >>>> of creating a compound data type with the real and imaginary parts. This >>>> all works well, but reading the data into a numpy array is a bit clumsy. >>>> >>>> Typically I do: >>>> >>>> nc = netCDF4.Dataset('my.nc') >>>> cplx_data = >>>> nc.groups['mygroup'].variables['cplx_stuff'][:].view('complex') >>>> >>>> which directly gives a nice complex numpy array. This is OK for small >>>> arrays, but is wasteful if I only need some chunks of the array because it >>>> reads all the data in, reducing the utility of the mmap feature of netCDF. >>>> >>>> I'm wondering if there is a better way to directly make a numpy array >>>> view that uses the netcdf variable's memory mapped buffer directly. Looking >>>> at the Variable class, there is no access to this buffer directly which >>>> could then be passed to np.ndarray(buffer=...). >>>> >>>> Any ideas of simple solutions to this problem? >>>> >>>> Thanks, >>>> Glenn >>>> >>>> _______________________________________________ >>>> NumPy-Discussion mailing list >>>> NumPy-Discussion at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>>> >>>> >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >>> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From shoyer at gmail.com Sun Mar 30 19:33:19 2014 From: shoyer at gmail.com (Stephan Hoyer) Date: Sun, 30 Mar 2014 16:33:19 -0700 Subject: [Numpy-discussion] Transparently reading complex arrays from netcdf4 In-Reply-To: References: Message-ID: Hi Glenn, Here is the line in my linked code defining the __array__ method: https://github.com/akleeman/xray/blob/0c1a963be0542b7303dc875278f3b163a15429c5/src/xray/conventions.py#L152 I don't know when Jeff Whitaker will be releasing the next version of netCDF4, but I expect that might be pretty soon if you asked nicely! Otherwise you can always download the development version off of github: https://github.com/Unidata/netcdf4-python Cheers, Stephan On Sun, Mar 30, 2014 at 5:18 AM, G Jones wrote: > Hi, > This looks useful. What you said about __array__ makes sense, but I didn't > see it in the code you linked. > Do you know when python netcdf4 will support the numpy array interface > directly? I searched around for a roadmap but didn't find anything. It may > be best for me to proceed with a slightly clumsy interface for now and wait > until the array interface is built in for free. > > Thanks, > Glenn > On Mar 30, 2014 2:18 AM, "Stephan Hoyer" wrote: > >> Hi Glenn, >> >> Here is a full example of how we wrap a netCDF4.Variable object, >> implementing all of its ndarray-like methods: >> >> https://github.com/akleeman/xray/blob/0c1a963be0542b7303dc875278f3b163a15429c5/src/xray/conventions.py#L91 >> >> The __array__ method would be the most relevant one for you: it means >> that numpy knows how to convert the wrapper array into a numpy.ndarray when >> you call np.mean(cplx_data). More generally, any function that calls >> np.asarray(cplx_data) will properly convert the values, which should >> include most functions from well-written libraries (including numpy and >> scipy). netCDF4.Variable doesn't currently have such an __array__ method, >> but it will in the next released version of the library. >> >> The quick and dirty hack to make all numpy methods work (now going beyond >> what the netCDF4 library implements) would be to add something like the >> following: >> >> def __getattr__(self, attr): >> return getattr(np.asarray(self), attr) >> >> But this is a little dangerous, since some methods might silently fail or >> give unpredictable results (e.g., those that modify data). It would be >> safer to list the methods you want to implement explicitly, or to just >> liberally use np.asarray. The later is generally a good practice when >> writing library code, anyways, to catch unusual ndarray subclasses like >> np.matrix. >> >> Stephan >> >> >> On Sat, Mar 29, 2014 at 8:42 PM, G Jones wrote: >> >>> Hi Stephan, >>> Thanks for the reply. I was thinking of something along these lines but >>> was hesitant because while this provides clean access to chunks of the >>> data, you still have to remember to do cplx_data[:].mean() for example in >>> the case that you want cplx_data.mean(). >>> >>> I was hoping to basically have all of the ndarray methods at hand >>> without any indexing, but then also being smart about taking advantage of >>> the mmap when possible. But perhaps your solution is the best compromise. >>> >>> Thanks again, >>> Glenn >>> On Mar 29, 2014 10:59 PM, "Stephan Hoyer" wrote: >>> >>>> Hi Glenn, >>>> >>>> My usual strategy for this sort of thing is to make a light-weight >>>> wrapper class which reads and converts values when you access them. For >>>> example: >>>> >>>> class WrapComplex(object): >>>> def __init__(self, nc_var): >>>> self.nc_var = nc_var >>>> >>>> def __getitem__(self, item): >>>> return self.nc_var[item].view('complex') >>>> >>>> nc = netCDF4.Dataset('my.nc') >>>> cplx_data = WrapComplex(nc.groups['mygroup'].variables['cplx_stuff']) >>>> >>>> Now you can index cplx_data (e.g., cplx_data[:10]) and only the values >>>> you need will be read from disk and converted on the fly. >>>> >>>> Hope this helps! >>>> >>>> Cheers, >>>> Stephan >>>> >>>> >>>> >>>> >>>> On Sat, Mar 29, 2014 at 6:13 PM, G Jones wrote: >>>> >>>>> Hi, >>>>> I am using netCDF4 to store complex data using the recommended >>>>> strategy of creating a compound data type with the real and imaginary >>>>> parts. This all works well, but reading the data into a numpy array is a >>>>> bit clumsy. >>>>> >>>>> Typically I do: >>>>> >>>>> nc = netCDF4.Dataset('my.nc') >>>>> cplx_data = >>>>> nc.groups['mygroup'].variables['cplx_stuff'][:].view('complex') >>>>> >>>>> which directly gives a nice complex numpy array. This is OK for small >>>>> arrays, but is wasteful if I only need some chunks of the array because it >>>>> reads all the data in, reducing the utility of the mmap feature of netCDF. >>>>> >>>>> I'm wondering if there is a better way to directly make a numpy array >>>>> view that uses the netcdf variable's memory mapped buffer directly. Looking >>>>> at the Variable class, there is no access to this buffer directly which >>>>> could then be passed to np.ndarray(buffer=...). >>>>> >>>>> Any ideas of simple solutions to this problem? >>>>> >>>>> Thanks, >>>>> Glenn >>>>> >>>>> _______________________________________________ >>>>> NumPy-Discussion mailing list >>>>> NumPy-Discussion at scipy.org >>>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>>>> >>>>> >>>> >>>> _______________________________________________ >>>> NumPy-Discussion mailing list >>>> NumPy-Discussion at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>>> >>>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >>> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From olivier.grisel at ensta.org Mon Mar 31 07:53:11 2014 From: olivier.grisel at ensta.org (Olivier Grisel) Date: Mon, 31 Mar 2014 13:53:11 +0200 Subject: [Numpy-discussion] ANN: NumPy 1.8.1 release In-Reply-To: References: <5332135C.7040903@googlemail.com> Message-ID: 2014-03-28 23:13 GMT+01:00 Matthew Brett : > Hi, > > On Fri, Mar 28, 2014 at 3:09 PM, Olivier Grisel > wrote: >> This is great! Has anyone started to work on OSX whl packages for >> scipy? I assume the libgfortran, libquadmath & libgcc_s dylibs will >> not make it as easy as for numpy. Would it be possible to use a static >> gcc toolchain as Carl Kleffner is using for his experimental windows >> whl packages? > > Yes, these are already done for the beta release, and for matplotlib: > > https://nipy.bic.berkeley.edu/scipy_installers/ > > Luckily OSX has a sensible way of setting relative paths to required > libraries, so it's pretty easy to copy the required dlls into the > binary distribution: > > https://github.com/matthew-brett/delocate Great! Do you think it would be possible to upload such a delocated .whl package for scipy 0.13.3 on pypi if all tests pass? Bonus question: do you think a similar solution could work for windows and / or linux? -- Olivier http://twitter.com/ogrisel - http://github.com/ogrisel From olivier.grisel at ensta.org Mon Mar 31 08:17:18 2014 From: olivier.grisel at ensta.org (Olivier Grisel) Date: Mon, 31 Mar 2014 14:17:18 +0200 Subject: [Numpy-discussion] ANN: NumPy 1.8.1 release In-Reply-To: References: <5332135C.7040903@googlemail.com> Message-ID: 2014-03-31 13:53 GMT+02:00 Olivier Grisel : > 2014-03-28 23:13 GMT+01:00 Matthew Brett : >> Hi, >> >> On Fri, Mar 28, 2014 at 3:09 PM, Olivier Grisel >> wrote: >>> This is great! Has anyone started to work on OSX whl packages for >>> scipy? I assume the libgfortran, libquadmath & libgcc_s dylibs will >>> not make it as easy as for numpy. Would it be possible to use a static >>> gcc toolchain as Carl Kleffner is using for his experimental windows >>> whl packages? >> >> Yes, these are already done for the beta release, and for matplotlib: >> >> https://nipy.bic.berkeley.edu/scipy_installers/ >> >> Luckily OSX has a sensible way of setting relative paths to required >> libraries, so it's pretty easy to copy the required dlls into the >> binary distribution: >> >> https://github.com/matthew-brett/delocate > > Great! Do you think it would be possible to upload such a delocated > .whl package for scipy 0.13.3 on pypi if all tests pass? I built such a whl package for the v0.13.3 tag of scipy, delocated it, "brew uninstall gfortran" to make sure that the dynlib loader would not be able to find the system libs, installed the resulting whl package in a new virtualenv and ran the tests: $ python -c "import scipy; scipy.test()" [...] Ran 8775 tests in 123.315s OK (KNOWNFAIL=113, SKIP=221) This is built on OSX 10.9. You can find the resulting wheel package on my dropbox: https://dl.dropboxusercontent.com/u/5743203/sklearn/wheelhouse/scipy-0.13.3-cp34-cp34m-macosx_10_6_intel.whl If scipy maintainers would like to upload such wheel packages for scipy 0.13.3 I can also prepare them for Python 2.7 and Python 3.3. -- Olivier http://twitter.com/ogrisel - http://github.com/ogrisel From matthew.brett at gmail.com Mon Mar 31 12:30:21 2014 From: matthew.brett at gmail.com (Matthew Brett) Date: Mon, 31 Mar 2014 09:30:21 -0700 Subject: [Numpy-discussion] ANN: NumPy 1.8.1 release In-Reply-To: References: <5332135C.7040903@googlemail.com> Message-ID: Hi, On Mon, Mar 31, 2014 at 5:17 AM, Olivier Grisel wrote: > 2014-03-31 13:53 GMT+02:00 Olivier Grisel : >> 2014-03-28 23:13 GMT+01:00 Matthew Brett : >>> Hi, >>> >>> On Fri, Mar 28, 2014 at 3:09 PM, Olivier Grisel >>> wrote: >>>> This is great! Has anyone started to work on OSX whl packages for >>>> scipy? I assume the libgfortran, libquadmath & libgcc_s dylibs will >>>> not make it as easy as for numpy. Would it be possible to use a static >>>> gcc toolchain as Carl Kleffner is using for his experimental windows >>>> whl packages? >>> >>> Yes, these are already done for the beta release, and for matplotlib: >>> >>> https://nipy.bic.berkeley.edu/scipy_installers/ >>> >>> Luckily OSX has a sensible way of setting relative paths to required >>> libraries, so it's pretty easy to copy the required dlls into the >>> binary distribution: >>> >>> https://github.com/matthew-brett/delocate >> >> Great! Do you think it would be possible to upload such a delocated >> .whl package for scipy 0.13.3 on pypi if all tests pass? > > I built such a whl package for the v0.13.3 tag of scipy, delocated it, > "brew uninstall gfortran" to make sure that the dynlib loader would > not be able to find the system libs, installed the resulting whl > package in a new virtualenv and ran the tests: > > $ python -c "import scipy; scipy.test()" > [...] > Ran 8775 tests in 123.315s > > OK (KNOWNFAIL=113, SKIP=221) > > This is built on OSX 10.9. You can find the resulting wheel package on > my dropbox: > > https://dl.dropboxusercontent.com/u/5743203/sklearn/wheelhouse/scipy-0.13.3-cp34-cp34m-macosx_10_6_intel.whl > > If scipy maintainers would like to upload such wheel packages for > scipy 0.13.3 I can also prepare them for Python 2.7 and Python 3.3. Thanks for doing those checks. Yes, I think it would be good to upload the scipy wheels, if nothing else they'd allow us to get early warning of any problems. Ralf, Pauli - any objections to uploading binary wheels for 0.13.3? I'm will test on a clean 10.6 installation before I upload them. Cheers, Matthew From ralf.gommers at gmail.com Mon Mar 31 13:14:14 2014 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Mon, 31 Mar 2014 19:14:14 +0200 Subject: [Numpy-discussion] ANN: NumPy 1.8.1 release In-Reply-To: References: <5332135C.7040903@googlemail.com> Message-ID: On Mon, Mar 31, 2014 at 6:30 PM, Matthew Brett wrote: > Hi, > > On Mon, Mar 31, 2014 at 5:17 AM, Olivier Grisel > wrote: > > 2014-03-31 13:53 GMT+02:00 Olivier Grisel : > >> 2014-03-28 23:13 GMT+01:00 Matthew Brett : > >>> Hi, > >>> > >>> On Fri, Mar 28, 2014 at 3:09 PM, Olivier Grisel > >>> wrote: > >>>> This is great! Has anyone started to work on OSX whl packages for > >>>> scipy? I assume the libgfortran, libquadmath & libgcc_s dylibs will > >>>> not make it as easy as for numpy. Would it be possible to use a static > >>>> gcc toolchain as Carl Kleffner is using for his experimental windows > >>>> whl packages? > >>> > >>> Yes, these are already done for the beta release, and for matplotlib: > >>> > >>> https://nipy.bic.berkeley.edu/scipy_installers/ > >>> > >>> Luckily OSX has a sensible way of setting relative paths to required > >>> libraries, so it's pretty easy to copy the required dlls into the > >>> binary distribution: > >>> > >>> https://github.com/matthew-brett/delocate > >> > >> Great! Do you think it would be possible to upload such a delocated > >> .whl package for scipy 0.13.3 on pypi if all tests pass? > > > > I built such a whl package for the v0.13.3 tag of scipy, delocated it, > > "brew uninstall gfortran" to make sure that the dynlib loader would > > not be able to find the system libs, installed the resulting whl > > package in a new virtualenv and ran the tests: > > > > $ python -c "import scipy; scipy.test()" > > [...] > > Ran 8775 tests in 123.315s > > > > OK (KNOWNFAIL=113, SKIP=221) > > > > This is built on OSX 10.9. You can find the resulting wheel package on > > my dropbox: > > > > > https://dl.dropboxusercontent.com/u/5743203/sklearn/wheelhouse/scipy-0.13.3-cp34-cp34m-macosx_10_6_intel.whl > > > > If scipy maintainers would like to upload such wheel packages for > > scipy 0.13.3 I can also prepare them for Python 2.7 and Python 3.3. > > Thanks for doing those checks. Yes, I think it would be good to > upload the scipy wheels, if nothing else they'd allow us to get early > warning of any problems. > > Ralf, Pauli - any objections to uploading binary wheels for 0.13.3? > I'm will test on a clean 10.6 installation before I upload them. > No objections, looks like a good idea to me. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Mon Mar 31 13:18:26 2014 From: matthew.brett at gmail.com (Matthew Brett) Date: Mon, 31 Mar 2014 10:18:26 -0700 Subject: [Numpy-discussion] ANN: NumPy 1.8.1 release In-Reply-To: References: <5332135C.7040903@googlemail.com> Message-ID: Hi, On Mon, Mar 31, 2014 at 4:53 AM, Olivier Grisel wrote: > 2014-03-28 23:13 GMT+01:00 Matthew Brett : >> Hi, >> >> On Fri, Mar 28, 2014 at 3:09 PM, Olivier Grisel >> wrote: >>> This is great! Has anyone started to work on OSX whl packages for >>> scipy? I assume the libgfortran, libquadmath & libgcc_s dylibs will >>> not make it as easy as for numpy. Would it be possible to use a static >>> gcc toolchain as Carl Kleffner is using for his experimental windows >>> whl packages? >> >> Yes, these are already done for the beta release, and for matplotlib: >> >> https://nipy.bic.berkeley.edu/scipy_installers/ >> >> Luckily OSX has a sensible way of setting relative paths to required >> libraries, so it's pretty easy to copy the required dlls into the >> binary distribution: >> >> https://github.com/matthew-brett/delocate > > Great! Do you think it would be possible to upload such a delocated > .whl package for scipy 0.13.3 on pypi if all tests pass? > > Bonus question: do you think a similar solution could work for windows > and / or linux? For linux - yes - I think that should be easy with a combination of ``ldd`` to find the dependencies and ``patchelf`` to set the rpath to point to the copied library. For Windows - I believe that it is not possible to set relative paths for windows DLLs, but I'd be very happy to be corrected. There is a function SetDllDirectory [1], but this would need some extra extension code in the package. Windows experts - is that an option? Cheers, Matthew [1] http://msdn.microsoft.com/en-us/library/ms686203(VS.85).aspx From chris.barker at noaa.gov Mon Mar 31 14:46:17 2014 From: chris.barker at noaa.gov (Chris Barker) Date: Mon, 31 Mar 2014 11:46:17 -0700 Subject: [Numpy-discussion] Dates and times and Datetime64 (again) In-Reply-To: References: <79374FB2-205D-4B76-ADB2-F9895D3A2DF4@uwaterloo.ca> <614DCCFD-BFFC-496D-B721-A08F68FFD6D2@uwaterloo.ca> Message-ID: On Sat, Mar 29, 2014 at 3:08 PM, Nathaniel Smith wrote: > On 29 Mar 2014 20:57, "Chris Barker" wrote: > > I think this is somewhat open for discussion -- yes, it's odd, but in > the spirit of practicality beats purity, it seems OK. We could allow any TZ > specifier for that matter -- that's kind of how "naive" or "local" timezone > (non) handling works -- it's up to the user to make sure that all DTs are > in the same timezone. > > That isn't how naive timezone handling works in datetime.datetime, though. > If you try to mix a timezone (even a Zulu timezone) datetime with a naive > datetime, you get an exception. > fari enough. The difference is that datetime.datetime doesn't provide any iso string parsing. The use case I'm imagining is for folks with ISO strings with a Z on the end -- they'll need to deal with pre-parsing the strings to strip off the Z, when it wouldn't change the result. Maybe this is an argument for "UTC always" rather than "naive"? -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Mon Mar 31 14:55:45 2014 From: chris.barker at noaa.gov (Chris Barker) Date: Mon, 31 Mar 2014 11:55:45 -0700 Subject: [Numpy-discussion] ANN: NumPy 1.8.1 release In-Reply-To: References: <5332135C.7040903@googlemail.com> Message-ID: On Mon, Mar 31, 2014 at 10:18 AM, Matthew Brett wrote: > > Bonus question: do you think a similar solution could work for windows > > and / or linux? > > For linux - yes - I think that should be easy with a combination of > ``ldd`` to find the dependencies and ``patchelf`` to set the rpath to > point to the copied library. > that part, yes, but isn't Linux too much of a varying target for there to be any point anyway? > For Windows - I believe that it is not possible to set relative paths > for windows DLLs, but I'd be very happy to be corrected. There is a > function SetDllDirectory [1], but this would need some extra extension > code in the package. Windows experts - is that an option? > The "usual" way is to put the dll next to where it is needed. I _think_ a when a one dll (the pyton extension) is linked to another one, the first place windows looks is right next to the one loading it -- same as for dlls linked to main executables. Unfortunately, anywehre else and all bets are off -- I was fighting with this a while back and found what I think is the source of "DLL Hell" -- it's the combination of these two: 1) Windows looks next to the executable for dll. 2) The search PATH for executables and dlls is the same. So some folks put dlls next to the executable And other folks get bit because the search PATH finds dlls next to unrelated executables. The python.org python install has a DLLs directory: C:\Python27\DLLs Maybe putting them there with nice long, non-standard names would work. Has anyone looked at how Anaconda does it? -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Mon Mar 31 15:05:40 2014 From: matthew.brett at gmail.com (Matthew Brett) Date: Mon, 31 Mar 2014 12:05:40 -0700 Subject: [Numpy-discussion] ANN: NumPy 1.8.1 release In-Reply-To: References: <5332135C.7040903@googlemail.com> Message-ID: Hi, On Mon, Mar 31, 2014 at 11:55 AM, Chris Barker wrote: > On Mon, Mar 31, 2014 at 10:18 AM, Matthew Brett > wrote: >> >> > Bonus question: do you think a similar solution could work for windows >> > and / or linux? >> >> For linux - yes - I think that should be easy with a combination of >> ``ldd`` to find the dependencies and ``patchelf`` to set the rpath to >> point to the copied library. > > > that part, yes, but isn't Linux too much of a varying target for there to be > any point anyway? You mean, the /usr/lib stuff varies too much, so that any copied dynamic libraries would have little chance of binary compatibility with the system libs? >> For Windows - I believe that it is not possible to set relative paths >> for windows DLLs, but I'd be very happy to be corrected. There is a >> function SetDllDirectory [1], but this would need some extra extension >> code in the package. Windows experts - is that an option? > > > The "usual" way is to put the dll next to where it is needed. I _think_ a > when a one dll (the pyton extension) is linked to another one, the first > place windows looks is right next to the one loading it -- same as for dlls > linked to main executables. I had assumed from [1] is that it's the path of the executable not the loading DLL that is on the DLL search path, but I might well be wrong I guess, if it was the path of the loading DLL, you'd run into trouble because you'd likely have python extensions in several directories, and then you'd need to copy the dependencies into all of them. > Unfortunately, anywehre else and all bets are off -- I was fighting with > this a while back and found what I think is the source of "DLL Hell" -- it's > the combination of these two: > > 1) Windows looks next to the executable for dll. > 2) The search PATH for executables and dlls is the same. > > So some folks put dlls next to the executable > And other folks get bit because the search PATH finds dlls next to unrelated > executables. > > The python.org python install has a DLLs directory: > > C:\Python27\DLLs > > Maybe putting them there with nice long, non-standard names would work. Sounds reasonable to me. > Has anyone looked at how Anaconda does it? Not me - would be interested to know. Cheers, Matthew [1] http://msdn.microsoft.com/en-us/library/ms682586(v=vs.85).aspx#standard_search_order_for_desktop_applications From chris.barker at noaa.gov Mon Mar 31 15:27:19 2014 From: chris.barker at noaa.gov (Chris Barker) Date: Mon, 31 Mar 2014 12:27:19 -0700 Subject: [Numpy-discussion] ANN: NumPy 1.8.1 release In-Reply-To: References: <5332135C.7040903@googlemail.com> Message-ID: On Mon, Mar 31, 2014 at 12:05 PM, Matthew Brett wrote: > > that part, yes, but isn't Linux too much of a varying target for there > to be > > any point anyway? > > You mean, the /usr/lib stuff varies too much, so that any copied > dynamic libraries would have little chance of binary compatibility > with the system libs? exactly. > The "usual" way is to put the dll next to where it is needed. I _think_ a > > when a one dll (the pyton extension) is linked to another one, the first > > place windows looks is right next to the one loading it -- same as for > dlls > > linked to main executables. > > I had assumed from [1] is that it's the path of the executable not the > loading DLL that is on the DLL search path, but I might well be wrong > I could be wring, too -- I'm pretty sure I tested this at some point, but It could be getting lost in the fog of memory. I guess, if it was the path of the loading DLL, you'd run into trouble > because you'd likely have python extensions in several directories, > and then you'd need to copy the dependencies into all of them. yup -- not ideal > > The python.org python install has a DLLs directory: > > > > C:\Python27\DLLs > > > > Maybe putting them there with nice long, non-standard names would work. > > Sounds reasonable to me. on that note -- looking at my Python install on Windows, I don't see "C:\Python27\DLLs" in PATH. So there must be some run-time way to tel Windows to look there. Maybe that could be leveraged. This may be a question for distutils-sig or something.... -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Mon Mar 31 18:09:04 2014 From: matthew.brett at gmail.com (Matthew Brett) Date: Mon, 31 Mar 2014 15:09:04 -0700 Subject: [Numpy-discussion] ANN: NumPy 1.8.1 release In-Reply-To: References: <5332135C.7040903@googlemail.com> Message-ID: Hi, On Mon, Mar 31, 2014 at 12:27 PM, Chris Barker wrote: > On Mon, Mar 31, 2014 at 12:05 PM, Matthew Brett > wrote: >> >> > that part, yes, but isn't Linux too much of a varying target for there >> > to be >> > any point anyway? >> >> You mean, the /usr/lib stuff varies too much, so that any copied >> dynamic libraries would have little chance of binary compatibility >> with the system libs? > > > exactly. > >> > The "usual" way is to put the dll next to where it is needed. I _think_ >> > a >> > when a one dll (the pyton extension) is linked to another one, the first >> > place windows looks is right next to the one loading it -- same as for >> > dlls >> > linked to main executables. >> >> I had assumed from [1] is that it's the path of the executable not the >> loading DLL that is on the DLL search path, but I might well be wrong > > > I could be wring, too -- I'm pretty sure I tested this at some point, but It > could be getting lost in the fog of memory. I am hopelessly lost here, but it looks as though Python extension modules get loaded via hDLL = LoadLibraryEx(pathname, NULL, LOAD_WITH_ALTERED_SEARCH_PATH); See: http://hg.python.org/cpython/file/3a1db0d2747e/Python/dynload_win.c#l195 I think this means that the first directory on the search path is indeed the path containing the extension module: http://msdn.microsoft.com/en-us/library/windows/desktop/ms682586(v=vs.85).aspx#alternate_search_order_for_desktop_applications So I'm guessing that it would not work putting DLLs into the 'DLLs' directory - unless the extension modules went in there too. I _think_ (David ?) this means it would not work to copy the dependent DLLs into sys.exec_prefix Cheers, Matthew From matthew.brett at gmail.com Mon Mar 31 20:59:52 2014 From: matthew.brett at gmail.com (Matthew Brett) Date: Mon, 31 Mar 2014 17:59:52 -0700 Subject: [Numpy-discussion] Default builds of OpenBLAS development branch are now fork safe In-Reply-To: <53331DBF.6020504@googlemail.com> References: <1225660970414595360.835902sturla.molden-gmail.com@news.gmane.org> <53331DBF.6020504@googlemail.com> Message-ID: Hi, On Wed, Mar 26, 2014 at 11:34 AM, Julian Taylor wrote: > On 26.03.2014 16:27, Olivier Grisel wrote: >> Hi Carl, >> >> I installed Python 2.7.6 64 bits on a windows server instance from >> rackspace cloud and then ran get-pip.py and then could successfully >> install the numpy and scipy wheel packages from your google drive >> folder. I tested dot products and scipy.linalg.svd and they work as >> expected. >> > >> >> Would it make sense to embed the blas and lapack header files as part >> of this numpy wheel and make numpy.distutils.system_info return the >> lib and include folder pointing to the embedded libopenblas.dll and >> header files so has to make third party libraries directly buildable >> against those? >> > > as for using openblas by default in binary builds, no. > pthread openblas build is now fork safe which is great but it is still > not reliable enough for a default. > E.g. the current latest release 0.2.8 still has one crash bug on > dgemv[1], and wrong results zherk/zer2[2] and dgemv/cgemv[3]. > git head has the former four fixed bug still has wrong results for cgemv. I noticed the Carl was only getting three test failures on scipy - are these related? ====================================================================== FAIL: test_decomp.test_eigh('general ', 6, 'F', True, False, False, (2, 4)) ---------------------------------------------------------------------- Traceback (most recent call last): File "D:\devel\py27\lib\site-packages\nose\case.py", line 197, in runTest self.test(*self.arg) File "D:\devel\py27\lib\site-packages\scipy\linalg\tests\test_decomp.py", line 642, in eigenhproblem_general assert_array_almost_equal(diag2_, ones(diag2_.shape[0]), DIGITS[dtype]) File "D:\devel\py27\lib\site-packages\numpy\testing\utils.py", line 811, in assert_array_almost_equal header=('Arrays are not almost equal to %d decimals' % decimal)) File "D:\devel\py27\lib\site-packages\numpy\testing\utils.py", line 644, in assert_array_compare raise AssertionError(msg) AssertionError: Arrays are not almost equal to 4 decimals (mismatch 100.0%) x: array([ 0., 0., 0.], dtype=float32) y: array([ 1., 1., 1.]) ====================================================================== FAIL: Tests for the minimize wrapper. ---------------------------------------------------------------------- Traceback (most recent call last): File "D:\devel\py27\lib\site-packages\nose\case.py", line 197, in runTest self.test(*self.arg) File "D:\devel\py27\lib\site-packages\scipy\optimize\tests\test_optimize.py", line 435, in test_minimize self.test_powell(True) File "D:\devel\py27\lib\site-packages\scipy\optimize\tests\test_optimize.py", line 209, in test_powell atol=1e-14, rtol=1e-7) File "D:\devel\py27\lib\site-packages\numpy\testing\utils.py", line 1181, in assert_allclose verbose=verbose, header=header) File "D:\devel\py27\lib\site-packages\numpy\testing\utils.py", line 644, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=1e-07, atol=1e-14 (mismatch 100.0%) x: array([[ 0.75077639, -0.44156936, 0.47100962], [ 0.75077639, -0.44156936, 0.48052496], [ 1.50155279, -0.88313872, 0.95153458],... y: array([[ 0.72949016, -0.44156936, 0.47100962], [ 0.72949016, -0.44156936, 0.48052496], [ 1.45898031, -0.88313872, 0.95153458],... ====================================================================== FAIL: Powell (direction set) optimization routine ---------------------------------------------------------------------- Traceback (most recent call last): File "D:\devel\py27\lib\site-packages\nose\case.py", line 197, in runTest self.test(*self.arg) File "D:\devel\py27\lib\site-packages\scipy\optimize\tests\test_optimize.py", line 209, in test_powell atol=1e-14, rtol=1e-7) File "D:\devel\py27\lib\site-packages\numpy\testing\utils.py", line 1181, in assert_allclose verbose=verbose, header=header) File "D:\devel\py27\lib\site-packages\numpy\testing\utils.py", line 644, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=1e-07, atol=1e-14 (mismatch 100.0%) x: array([[ 0.75077639, -0.44156936, 0.47100962], [ 0.75077639, -0.44156936, 0.48052496], [ 1.50155279, -0.88313872, 0.95153458],... y: array([[ 0.72949016, -0.44156936, 0.47100962], [ 0.72949016, -0.44156936, 0.48052496], [ 1.45898031, -0.88313872, 0.95153458],... ---------------------------------------------------------------------- Ran 8940 tests in 143.892s > Openblas is great if you do not have the patience to build ATLAS and > only use a restricted set of functionality and platforms you can easily > test. I don't think it's possible to build ATLAS on Windows 64-bit at the moment, and it would take a lot of work to make it build, and Clint W has said he does not want to invest much time maintaining the Windows build, so unless something changes, I think ATLAS is not a viable option - for 64 bits at least. Cheers, Matthew From njs at pobox.com Mon Mar 31 22:19:15 2014 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 1 Apr 2014 03:19:15 +0100 Subject: [Numpy-discussion] Dates and times and Datetime64 (again) In-Reply-To: References: <79374FB2-205D-4B76-ADB2-F9895D3A2DF4@uwaterloo.ca> <614DCCFD-BFFC-496D-B721-A08F68FFD6D2@uwaterloo.ca> Message-ID: On 31 Mar 2014 19:47, "Chris Barker" wrote: > > On Sat, Mar 29, 2014 at 3:08 PM, Nathaniel Smith wrote: >> >> On 29 Mar 2014 20:57, "Chris Barker" wrote: >> > I think this is somewhat open for discussion -- yes, it's odd, but in the spirit of practicality beats purity, it seems OK. We could allow any TZ specifier for that matter -- that's kind of how "naive" or "local" timezone (non) handling works -- it's up to the user to make sure that all DTs are in the same timezone. >> >> That isn't how naive timezone handling works in datetime.datetime, though. If you try to mix a timezone (even a Zulu timezone) datetime with a naive datetime, you get an exception. > > fari enough. > > The difference is that datetime.datetime doesn't provide any iso string parsing. Sure it does. datetime.strptime, with the %z modifier in particular. > The use case I'm imagining is for folks with ISO strings with a Z on the end -- they'll need to deal with pre-parsing the strings to strip off the Z, when it wouldn't change the result. > > Maybe this is an argument for "UTC always" rather than "naive"? Probably it is, but that approach seems a lot harder to extend to proper tz support later, plus being more likely to cause trouble for pandas's proper tz support now. -n -------------- next part -------------- An HTML attachment was scrubbed... URL: