From sebastian at sipsolutions.net Sun Nov 1 10:53:26 2015 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Sun, 1 Nov 2015 15:53:26 +0000 Subject: [Numpy-discussion] Commit rights for Jonathan J. Helmus In-Reply-To: References: <563187E7.10801@gmail.com> <56338AA4.5080308@gmail.com> Message-ID: Congrats, both of you ;). On Sun Nov 1 04:30:27 2015 GMT+0330, Jaime Fern?ndez del R?o wrote: > "Gruetzi!", as I just found out we say in Switzerland... > On Oct 30, 2015 8:20 AM, "Jonathan Helmus" wrote: > > > On 10/28/2015 09:43 PM, Allan Haldane wrote: > > > On 10/28/2015 05:27 PM, Nathaniel Smith wrote: > > >> Hi all, > > >> > > >> Jonathan J. Helmus (@jjhelmus) has been given commit rights -- let's all > > >> welcome him aboard. > > >> > > >> -n > > > > > > Welcome Jonathan, happy to have you on the team! > > > > > > Allan > > > > > > > Thanks you everyone for the kind welcome. I'm looking forwarding to > > being part of them team. > > > > - Jonathan Helmus > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > From ralf.gommers at gmail.com Sun Nov 1 18:16:27 2015 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Mon, 2 Nov 2015 00:16:27 +0100 Subject: [Numpy-discussion] Proposal: stop supporting 'setup.py install'; start requiring 'pip install .' instead In-Reply-To: References: Message-ID: On Sun, Nov 1, 2015 at 1:59 AM, Ralf Gommers wrote: > > > On Sun, Nov 1, 2015 at 1:54 AM, Ralf Gommers > wrote: > >> >> >> >> On Thu, Oct 29, 2015 at 8:11 PM, Warren Weckesser < >> warren.weckesser at gmail.com> wrote: >> >>> >>> >>> On Tue, Oct 27, 2015 at 12:31 AM, Nathaniel Smith wrote: >>> >>>> Hi all, >>>> >>>> Apparently it is not well known that if you have a Python project >>>> source tree (e.g., a numpy checkout), then the correct way to install >>>> it is NOT to type >>>> >>>> python setup.py install # bad and broken! >>>> >>>> but rather to type >>>> >>>> pip install . >>>> >>>> >>> >>> FWIW, I don't see any mention of this in the numpy docs, but I do see a >>> lot of instructions involving `setup.py build` and `setup.py install`. >>> See, for example, INSTALL.txt. Also see >>> >>> http://docs.scipy.org/doc/numpy/user/install.html#building-from-source >>> So I guess it is not surprising that it is not well known. >>> >> >> Indeed, install docs are always hopelessly outdated. And we have too many >> of them. There's duplicate info in INSTALL.txt and >> http://scipy.org/scipylib/building/index.html for example. We should >> probably just empty out INSTALL.txt and simply put a link in it to the html >> docs. >> >> I've created an issue with a long todo list and a bunch of links: >> https://github.com/numpy/numpy/issues/6599. Feel free to add stuff. Or >> to go fix something:) >> > > Oh, and: looking at this thread there haven't been serious unanswered > concerns (at least in my perception), so without more discussion I'd > interpret the current status as "go ahead". > Hmm, after some more testing I'm going to have to bring up a few concerns myself: 1. ``pip install .`` still has a clear bug; it starts by copying everything (including .git/ !) to a tempdir with shutil, which is very slow. And the fix for that will go via ``setup.py sdist``, which is still slow. 2. 
``pip install .`` silences build output, which may make sense for some usecases, but for numpy it just sits there for minutes with no output after printing "Running setup.py install for numpy". Users will think it hangs and Ctrl-C it. https://github.com/pypa/pip/issues/2732 3. ``pip install .`` refuses to upgrade an already installed development version. For released versions that makes sense, but if I'm in a git tree then I don't want it to refuse because 1.11.0.dev0+githash1 compares equal to 1.11.0.dev0+githash2. Especially after waiting a few minutes, see (1). I've sent a (incomplete) fix for the shutil thing ( https://github.com/pypa/pip/pull/3219) and will comment on some open issues on the pip tracker. But I'm thinking that for now we should go with some printed message first. Something like "please use ``pip install .`` if you want reliable uninstall behavior. See for more details". Pip has worked quite well for me in the past, but the above makes me thing it's not much of an improvement over use of setuptools..... Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sun Nov 1 19:12:33 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 1 Nov 2015 17:12:33 -0700 Subject: [Numpy-discussion] Proposal: stop supporting 'setup.py install'; start requiring 'pip install .' instead In-Reply-To: References: Message-ID: On Sun, Nov 1, 2015 at 4:16 PM, Ralf Gommers wrote: > > > On Sun, Nov 1, 2015 at 1:59 AM, Ralf Gommers > wrote: > >> >> >> On Sun, Nov 1, 2015 at 1:54 AM, Ralf Gommers >> wrote: >> >>> >>> >>> >>> On Thu, Oct 29, 2015 at 8:11 PM, Warren Weckesser < >>> warren.weckesser at gmail.com> wrote: >>> >>>> >>>> >>>> On Tue, Oct 27, 2015 at 12:31 AM, Nathaniel Smith >>>> wrote: >>>> >>>>> Hi all, >>>>> >>>>> Apparently it is not well known that if you have a Python project >>>>> source tree (e.g., a numpy checkout), then the correct way to install >>>>> it is NOT to type >>>>> >>>>> python setup.py install # bad and broken! >>>>> >>>>> but rather to type >>>>> >>>>> pip install . >>>>> >>>>> >>>> >>>> FWIW, I don't see any mention of this in the numpy docs, but I do see a >>>> lot of instructions involving `setup.py build` and `setup.py install`. >>>> See, for example, INSTALL.txt. Also see >>>> >>>> http://docs.scipy.org/doc/numpy/user/install.html#building-from-source >>>> So I guess it is not surprising that it is not well known. >>>> >>> >>> Indeed, install docs are always hopelessly outdated. And we have too >>> many of them. There's duplicate info in INSTALL.txt and >>> http://scipy.org/scipylib/building/index.html for example. We should >>> probably just empty out INSTALL.txt and simply put a link in it to the html >>> docs. >>> >>> I've created an issue with a long todo list and a bunch of links: >>> https://github.com/numpy/numpy/issues/6599. Feel free to add stuff. Or >>> to go fix something:) >>> >> >> Oh, and: looking at this thread there haven't been serious unanswered >> concerns (at least in my perception), so without more discussion I'd >> interpret the current status as "go ahead". >> > > Hmm, after some more testing I'm going to have to bring up a few concerns > myself: > > 1. ``pip install .`` still has a clear bug; it starts by copying > everything (including .git/ !) to a tempdir with shutil, which is very > slow. And the fix for that will go via ``setup.py sdist``, which is still > slow. > > 2. 
``pip install .`` silences build output, which may make sense for some > usecases, but for numpy it just sits there for minutes with no output after > printing "Running setup.py install for numpy". Users will think it hangs > and Ctrl-C it. https://github.com/pypa/pip/issues/2732 > > 3. ``pip install .`` refuses to upgrade an already installed development > version. For released versions that makes sense, but if I'm in a git tree > then I don't want it to refuse because 1.11.0.dev0+githash1 compares equal > to 1.11.0.dev0+githash2. Especially after waiting a few minutes, see (1). > > > I've sent a (incomplete) fix for the shutil thing ( > https://github.com/pypa/pip/pull/3219) and will comment on some open > issues on the pip tracker. But I'm thinking that for now we should go with > some printed message first. Something like "please use ``pip install .`` if > you want reliable uninstall behavior. See for more details". > > Pip has worked quite well for me in the past, but the above makes me thing > it's not much of an improvement over use of setuptools..... > Which version of pip? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From sturla.molden at gmail.com Mon Nov 2 00:22:01 2015 From: sturla.molden at gmail.com (Sturla Molden) Date: Mon, 2 Nov 2015 05:22:01 +0000 (UTC) Subject: [Numpy-discussion] isfortran compatibility in numpy 1.10. References: Message-ID: <1351855729468134001.145125sturla.molden-gmail.com@news.gmane.org> Charles R Harris wrote: > 1. Return `a.flags.f_contiguous`. This differs for 1-D arrays, but is > most consistent with the name isfortran. If the idea is to determine if an array can safely be passed to Fortran, this is the correct one. > 2. Return `a.flags.f_contiguous and a.ndim > 1`, which would be backward > compatible. This one is just wrong. A compromize might be to raise an exception in the case of a.ndim<2. Sturla From ralf.gommers at gmail.com Mon Nov 2 01:47:34 2015 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Mon, 2 Nov 2015 07:47:34 +0100 Subject: [Numpy-discussion] Proposal: stop supporting 'setup.py install'; start requiring 'pip install .' instead In-Reply-To: References: Message-ID: On Mon, Nov 2, 2015 at 1:12 AM, Charles R Harris wrote: > > > On Sun, Nov 1, 2015 at 4:16 PM, Ralf Gommers > wrote: > >> >> >> On Sun, Nov 1, 2015 at 1:59 AM, Ralf Gommers >> wrote: >> >>> >>> >>> On Sun, Nov 1, 2015 at 1:54 AM, Ralf Gommers >>> wrote: >>> >>>> >>>> >>>> >>>> On Thu, Oct 29, 2015 at 8:11 PM, Warren Weckesser < >>>> warren.weckesser at gmail.com> wrote: >>>> >>>>> >>>>> >>>>> On Tue, Oct 27, 2015 at 12:31 AM, Nathaniel Smith >>>>> wrote: >>>>> >>>>>> Hi all, >>>>>> >>>>>> Apparently it is not well known that if you have a Python project >>>>>> source tree (e.g., a numpy checkout), then the correct way to install >>>>>> it is NOT to type >>>>>> >>>>>> python setup.py install # bad and broken! >>>>>> >>>>>> but rather to type >>>>>> >>>>>> pip install . >>>>>> >>>>>> >>>>> >>>>> FWIW, I don't see any mention of this in the numpy docs, but I do see >>>>> a lot of instructions involving `setup.py build` and `setup.py install`. >>>>> See, for example, INSTALL.txt. Also see >>>>> >>>>> http://docs.scipy.org/doc/numpy/user/install.html#building-from-source >>>>> So I guess it is not surprising that it is not well known. >>>>> >>>> >>>> Indeed, install docs are always hopelessly outdated. And we have too >>>> many of them. 
There's duplicate info in INSTALL.txt and >>>> http://scipy.org/scipylib/building/index.html for example. We should >>>> probably just empty out INSTALL.txt and simply put a link in it to the html >>>> docs. >>>> >>>> I've created an issue with a long todo list and a bunch of links: >>>> https://github.com/numpy/numpy/issues/6599. Feel free to add stuff. Or >>>> to go fix something:) >>>> >>> >>> Oh, and: looking at this thread there haven't been serious unanswered >>> concerns (at least in my perception), so without more discussion I'd >>> interpret the current status as "go ahead". >>> >> >> Hmm, after some more testing I'm going to have to bring up a few concerns >> myself: >> >> 1. ``pip install .`` still has a clear bug; it starts by copying >> everything (including .git/ !) to a tempdir with shutil, which is very >> slow. And the fix for that will go via ``setup.py sdist``, which is still >> slow. >> >> 2. ``pip install .`` silences build output, which may make sense for some >> usecases, but for numpy it just sits there for minutes with no output after >> printing "Running setup.py install for numpy". Users will think it hangs >> and Ctrl-C it. https://github.com/pypa/pip/issues/2732 >> >> 3. ``pip install .`` refuses to upgrade an already installed development >> version. For released versions that makes sense, but if I'm in a git tree >> then I don't want it to refuse because 1.11.0.dev0+githash1 compares equal >> to 1.11.0.dev0+githash2. Especially after waiting a few minutes, see (1). >> >> >> I've sent a (incomplete) fix for the shutil thing ( >> https://github.com/pypa/pip/pull/3219) and will comment on some open >> issues on the pip tracker. But I'm thinking that for now we should go with >> some printed message first. Something like "please use ``pip install .`` if >> you want reliable uninstall behavior. See for more details". >> >> Pip has worked quite well for me in the past, but the above makes me >> thing it's not much of an improvement over use of setuptools..... >> > > Which version of pip? > Latest master (it's 'develop' branch). Recent released versions will be the same, because there are open issues for these things. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From faltet at gmail.com Mon Nov 2 05:09:05 2015 From: faltet at gmail.com (Francesc Alted) Date: Mon, 2 Nov 2015 11:09:05 +0100 Subject: [Numpy-discussion] ANN: numexpr 2.4.5 released Message-ID: ========================= Announcing Numexpr 2.4.5 ========================= Numexpr is a fast numerical expression evaluator for NumPy. With it, expressions that operate on arrays (like "3*a+4*b") are accelerated and use less memory than doing the same calculation in Python. It wears multi-threaded capabilities, as well as support for Intel's MKL (Math Kernel Library), which allows an extremely fast evaluation of transcendental functions (sin, cos, tan, exp, log...) while squeezing the last drop of performance out of your multi-core processors. Look here for a some benchmarks of numexpr using MKL: https://github.com/pydata/numexpr/wiki/NumexprMKL Its only dependency is NumPy (MKL is optional), so it works well as an easy-to-deploy, easy-to-use, computational engine for projects that don't want to adopt other solutions requiring more heavy dependencies. What's new ========== This is a maintenance release where an important bug in multithreading code has been fixed (#185 Benedikt Reinartz, Francesc Alted). 
Also, many harmless warnings (overflow/underflow, divide by zero and others) in the test suite have been silenced (#183, Francesc Alted). In case you want to know more in detail what has changed in this version, see: https://github.com/pydata/numexpr/blob/master/RELEASE_NOTES.rst Where I can find Numexpr? ========================= The project is hosted at GitHub in: https://github.com/pydata/numexpr You can get the packages from PyPI as well (but not for RC releases): http://pypi.python.org/pypi/numexpr Share your experience ===================== Let us know of any bugs, suggestions, gripes, kudos, etc. you may have. Enjoy data! -- Francesc Alted -------------- next part -------------- An HTML attachment was scrubbed... URL: From scollis.acrf at gmail.com Mon Nov 2 11:40:25 2015 From: scollis.acrf at gmail.com (Scott Collis) Date: Mon, 02 Nov 2015 10:40:25 -0600 Subject: [Numpy-discussion] Argonne is hiring a postdoc in radar forward modelling using Python Message-ID: <563791F9.6060807@gmail.com> Dear Numpy Users, Argonne National Lab is hiring a postdoc working with the team behind Py-ART. Please take a look and use this link to apply and direct any questions towards me. http://careers.peopleclick.com/careerscp/client_argonnelab/post_doc/en_US/gateway.do?functionName=viewFromLink&localeCode=en-us&jobPostId=3702&source=Facebook&sourceType=NETWORKING_SITE Long shot I know, but we found our key developer using this list last time :) Cheers, Scott -- -- Dr Scott Collis ARM Precipitation Radar Translator Environmental Science Division Argonne National Laboratory Mb: +1 630 235 8025 Of: +1 630 252 0550 Become a Py-ART user today! http://arm-doe.github.io/pyart/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Mon Nov 2 13:28:23 2015 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Mon, 2 Nov 2015 18:28:23 +0000 Subject: [Numpy-discussion] isfortran compatibility in numpy 1.10. In-Reply-To: <1351855729468134001.145125sturla.molden-gmail.com@news.gmane.org> References: <1351855729468134001.145125sturla.molden-gmail.com@news.gmane.org> Message-ID: I bet it has all been said already, but to note just in case. In numpy itself we use it mostly to determine the memory order of the *output* and not for safty purpose. That is the macro of course and I think yelling people to use flags.fnc in python is better. - Sebastian On Mon Nov 2 08:52:01 2015 GMT+0330, Sturla Molden wrote: > Charles R Harris wrote: > > > 1. Return `a.flags.f_contiguous`. This differs for 1-D arrays, but is > > most consistent with the name isfortran. > > If the idea is to determine if an array can safely be passed to Fortran, > this is the correct one. > > > 2. Return `a.flags.f_contiguous and a.ndim > 1`, which would be backward > > compatible. > > This one is just wrong. > > A compromize might be to raise an exception in the case of a.ndim<2. > > Sturla > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > From charlesr.harris at gmail.com Mon Nov 2 13:49:27 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 2 Nov 2015 11:49:27 -0700 Subject: [Numpy-discussion] isfortran compatibility in numpy 1.10. 
In-Reply-To: References: <1351855729468134001.145125sturla.molden-gmail.com@news.gmane.org> Message-ID: On Mon, Nov 2, 2015 at 11:28 AM, Sebastian Berg wrote: > I bet it has all been said already, but to note just in case. In numpy > itself we use it mostly to determine the memory order of the *output* and > not for safty purpose. That is the macro of course and I think yelling > people to use flags.fnc in python is better. > Probably all the Numpy uses of `PyArray_ISFORTRAN` should be audited. My guess is that it will be found to be incorrect in some (most?) of the places. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From faltet at gmail.com Mon Nov 2 14:13:46 2015 From: faltet at gmail.com (Francesc Alted) Date: Mon, 2 Nov 2015 20:13:46 +0100 Subject: [Numpy-discussion] ANN: numexpr 2.4.6 released Message-ID: Hi, This is a quick release fixing some reported problems in the 2.4.5 version that I announced a few hours ago. Hope I have fixed the main issues now. Now, the official announcement: ========================= Announcing Numexpr 2.4.6 ========================= Numexpr is a fast numerical expression evaluator for NumPy. With it, expressions that operate on arrays (like "3*a+4*b") are accelerated and use less memory than doing the same calculation in Python. It wears multi-threaded capabilities, as well as support for Intel's MKL (Math Kernel Library), which allows an extremely fast evaluation of transcendental functions (sin, cos, tan, exp, log...) while squeezing the last drop of performance out of your multi-core processors. Look here for a some benchmarks of numexpr using MKL: https://github.com/pydata/numexpr/wiki/NumexprMKL Its only dependency is NumPy (MKL is optional), so it works well as an easy-to-deploy, easy-to-use, computational engine for projects that don't want to adopt other solutions requiring more heavy dependencies. What's new ========== This is a quick maintenance version that offers better handling of MSVC symbols (#168, Francesc Alted), as well as fising some UserWarnings in Solaris (#189, Graham Jones). In case you want to know more in detail what has changed in this version, see: https://github.com/pydata/numexpr/blob/master/RELEASE_NOTES.rst Where I can find Numexpr? ========================= The project is hosted at GitHub in: https://github.com/pydata/numexpr You can get the packages from PyPI as well (but not for RC releases): http://pypi.python.org/pypi/numexpr Share your experience ===================== Let us know of any bugs, suggestions, gripes, kudos, etc. you may have. Enjoy data! -- Francesc Alted -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Mon Nov 2 18:04:54 2015 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Tue, 3 Nov 2015 00:04:54 +0100 Subject: [Numpy-discussion] Numpy style docstring support in Sphinx and PyCharm Message-ID: Hi all, Just noticed this: http://sphinx-doc.org/latest/ext/napoleon.html http://www.jetbrains.com/pycharm/whatsnew/index.html#GDocstrings Slowly conquering the docstring world:) Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Mon Nov 2 18:44:06 2015 From: chris.barker at noaa.gov (Chris Barker) Date: Mon, 2 Nov 2015 15:44:06 -0800 Subject: [Numpy-discussion] deprecate fromstring() for text reading? 
In-Reply-To: References: <2283704104052164280@unknownmsgid> <-1464708838107245522@unknownmsgid> Message-ID: On Tue, Oct 27, 2015 at 7:30 AM, Benjamin Root wrote: > FWIW, when I needed a fast Fixed Width reader > was there potentially no whitespace between fields in that case? In which case, it really isn a different use-case than delimited text -- if it's at all common, a version written in C would be nice and fast. and nat hard to do. But fromstring never would have helped you with that anyway :-) -CHB > for a very large dataset last year, I found that np.genfromtext() was > faster than pandas' read_fwf(). IIRC, pandas' text reading code fell back > to pure python for fixed width scenarios. > > On Fri, Oct 23, 2015 at 8:22 PM, Chris Barker - NOAA Federal < > chris.barker at noaa.gov> wrote: > >> Grabbing the pandas csv reader would be great, and I hope it happens >> sooner than later, though alas, I haven't the spare cycles for it either. >> >> In the meantime though, can we put a deprecation Warning in when using >> fromstring() on text files? It's really pretty broken. >> >> -Chris >> >> On Oct 23, 2015, at 4:02 PM, Jeff Reback wrote: >> >> >> >> On Oct 23, 2015, at 6:49 PM, Nathaniel Smith wrote: >> >> On Oct 23, 2015 3:30 PM, "Jeff Reback" wrote: >> > >> > On Oct 23, 2015, at 6:13 PM, Charles R Harris < >> charlesr.harris at gmail.com> wrote: >> > >> >> >> >> >> >> On Thu, Oct 22, 2015 at 5:47 PM, Chris Barker - NOAA Federal < >> chris.barker at noaa.gov> wrote: >> >>> >> >>> >> >>>> I think it would be good to keep the usage to read binary data at >> least. >> >>> >> >>> >> >>> Agreed -- it's only the text file reading I'm proposing to deprecate. >> It was kind of weird to cram it in there in the first place. >> >>> >> >>> Oh, fromfile() has the same issues. >> >>> >> >>> Chris >> >>> >> >>> >> >>>> Or is there a good alternative to `np.fromstring(, >> dtype=...)`? -- Marten >> >>>> >> >>>> On Thu, Oct 22, 2015 at 1:03 PM, Chris Barker >> wrote: >> >>>>> >> >>>>> There was just a question about a bug/issue with scipy.fromstring >> (which is numpy.fromstring) when used to read integers from a text file. >> >>>>> >> >>>>> >> https://mail.scipy.org/pipermail/scipy-user/2015-October/036746.html >> >>>>> >> >>>>> fromstring() is bugging and inflexible for reading text files -- >> and it is a very, very ugly mess of code. I dug into it a while back, and >> gave up -- just to much of a mess! >> >>>>> >> >>>>> So we really should completely re-implement it, or deprecate it. I >> doubt anyone is going to do a big refactor, so that means deprecating it. >> >>>>> >> >>>>> Also -- if we do want a fast read numbers from text files function >> (which would be nice, actually), it really should get a new name anyway. >> >>>>> >> >>>>> (and the hopefully coming new dtype system would make it easier to >> write cleanly) >> >>>>> >> >>>>> I'm not sure what deprecating something means, though -- have it >> raise a deprecation warning in the next version? >> >>>>> >> >> >> >> There was discussion at SciPy 2015 of separating out the text reading >> abilities of Pandas so that numpy could include it. We should contact Jeff >> Rebeck and see about moving that forward. >> > >> > >> > IIRC Thomas Caswell was interested in doing this :) >> >> When he was in Berkeley a few weeks ago he assured me that every night >> since SciPy he has dutifully been feeling guilty about not having done it >> yet. I think this week his paltry excuse is that he's "on his honeymoon" or >> something. 
>> >> ...which is to say that if someone has some spare cycles to take this >> over then I think that might be a nice wedding present for him :-). >> >> (The basic idea is to take the text reading backend behind >> pandas.read_csv and extract it into a standalone package that pandas could >> depend on, and that could also be used by other packages like numpy (among >> others -- I thing dato's SFrame package has a fork of this code as well?)) >> >> -n >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> >> I can certainly provide guidance on how/what to extract but don't have >> spare cycles myself for this :( >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Mon Nov 2 18:55:46 2015 From: chris.barker at noaa.gov (Chris Barker) Date: Mon, 2 Nov 2015 15:55:46 -0800 Subject: [Numpy-discussion] [NumPy/Swig] Return NumPy array with same size as input array (no additional length argument) In-Reply-To: <1446272109262-41601.post@n7.nabble.com> References: <1446272109262-41601.post@n7.nabble.com> Message-ID: On Fri, Oct 30, 2015 at 11:15 PM, laurentes wrote: > Using Swig, I don't manage to (properly) create the Python Binding for the > following C-like function: > > void add_array(double* input_array1, double* input_array2, double* > output_array, int length); > > where the three arrays have all the same length. > do you have to use SWIG? this would be really easy in Cython.... cdef cdef extern from "your_header.h": void add_array(double* input_array1, double* input_array2, double* output_array, int length) def py_add_array( np.ndarray[double, ndim=1] arr1, np.ndarray[double, ndim=1] arr2): cdef int length if arr1.shape != arr2.shape: raise ValueError("Arrays must be the same size") length = arr1.shape[0] cdef np.ndarray[double, ndim=1] out_arr = np.empty((length), dtype=np.float64) add_array(&arr1[0], &arr2[0], &out_arr[0], length) return out_arr Untested and from memory -- but you get the idea. -CHB > > > > This is similar to this thread > > < > http://numpy-discussion.10968.n7.nabble.com/Numpy-SWIG-td25709.html#a25710 > > > > , which has never been fully addressed online. > > > > From Python, I would like to be able to call: > > > add_array(input_array1, input_array2) > > which would return me a newly allocated NumPy array (output_array) with the > result. 
> > In my Swig file, I've first used the wrapper function trick described here > < > http://web.mit.edu/6.863/spring2011/packages/numpy_src/doc/swig/doc/numpy_swig.html#a-common-example > > > , that is: > > %apply (double* IN_ARRAY1, int DIM1) {(double* input_array1, int length1), > (double* input_array2, int length2)}; > %apply (double* ARGOUT_ARRAY1, int DIM1) {(double* output_array, int > length3)}; > > %rename (add_array) my_add_array; > %exception my_add_array { > $action > if (PyErr_Occurred()) SWIG_fail; > } > %inline %{ > void my_add_array(double* input_array1, int length1, double* input_array2, > int length2, double* output_array, int length3) { > if (length1 != length2 || length1 != length3) { > PyErr_Format(PyExc_ValueError, > "Arrays of lengths (%d,%d,%d) given", > length1, length2, length3); > } > else { > add_array(input_array1, input_array2, output_array, length1); > } > } > %} > > This allows me to call the function from Python using > add_array(input_array1, input_array2, length). But the third argument of > this function is useless and this function does not look 'Pythonic'. > > Could someone help me to modify my Swig file, such that only the first two > arguments are required for the Python API? > > Thanks a lot, > Laurent > > > > -- > View this message in context: > http://numpy-discussion.10968.n7.nabble.com/NumPy-Swig-Return-NumPy-array-with-same-size-as-input-array-no-additional-length-argument-tp41601.html > Sent from the Numpy-discussion mailing list archive at Nabble.com. > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Mon Nov 2 19:04:17 2015 From: chris.barker at noaa.gov (Chris Barker) Date: Mon, 2 Nov 2015 16:04:17 -0800 Subject: [Numpy-discussion] Proposal: stop supporting 'setup.py install'; start requiring 'pip install .' instead In-Reply-To: References: <562F80A5.6060004@gmail.com> Message-ID: On Tue, Oct 27, 2015 at 8:25 AM, Nathan Goldbaum wrote: > Interestingly, conda actually does "setup.py install" in the recipe for > numpy: > indeed -- many, many conda packages do setup.py install, whihc doesn't mean it's a good idea --personally, I'm trying hard to switch them all to: pip install ./ :-) Which reminds me, the conda skelaton command craes a pip install build.sh file -- I really need to submit a PR for that ... There are two cases where a 'pip install' run might go off and start >> downloading packages without asking you: >> > for my part, regular old setup.py isntall oftem goes off and istalls sutff too - using easy_install, which really sucks... This is making me want a setuptools-lite again -- see the distutils SIG if you're curious. -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From njs at pobox.com Mon Nov 2 20:57:35 2015 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 2 Nov 2015 17:57:35 -0800 Subject: [Numpy-discussion] Proposal: stop supporting 'setup.py install'; start requiring 'pip install .' instead In-Reply-To: References: Message-ID: [Adding distutils-sig to the CC as a heads-up. The context is that numpy is looking at deprecating the use of 'python setup.py install' and enforcing the use of 'pip install .' instead, and running into some issues that will probably need to be addressed if 'pip install .' is going to become the standard interface to work with source trees.] On Sun, Nov 1, 2015 at 3:16 PM, Ralf Gommers wrote: [...] > Hmm, after some more testing I'm going to have to bring up a few concerns > myself: > > 1. ``pip install .`` still has a clear bug; it starts by copying everything > (including .git/ !) to a tempdir with shutil, which is very slow. And the > fix for that will go via ``setup.py sdist``, which is still slow. Ugh. If 'pip (install/wheel) .' is supposed to become the standard way to build things, then it should probably build in-place by default. Working in a temp dir makes perfect sense for 'pip install ' or 'pip install ', but if the user supplies an actual named on-disk directory then presumably the user is expecting this directory to be used, and to be able to take advantage of incremental rebuilds etc., no? > 2. ``pip install .`` silences build output, which may make sense for some > usecases, but for numpy it just sits there for minutes with no output after > printing "Running setup.py install for numpy". Users will think it hangs and > Ctrl-C it. https://github.com/pypa/pip/issues/2732 I tend to agree with the commentary there that for end users this is different but no worse than the current situation where we spit out pages of "errors" that don't mean anything :-). I posted a suggestion on that bug that might help with the apparent hanging problem. > 3. ``pip install .`` refuses to upgrade an already installed development > version. For released versions that makes sense, but if I'm in a git tree > then I don't want it to refuse because 1.11.0.dev0+githash1 compares equal > to 1.11.0.dev0+githash2. Especially after waiting a few minutes, see (1). Ugh, this is clearly just a bug -- `pip install .` should always unconditionally install, IMO. (Did you file a bug yet?) At least the workaround is just 'pip uninstall numpy; pip install .', which is still better the running 'setup.py install' and having it blithely overwrite some files and not others. The first and last issue seem like ones that will mostly only affect developers, who should mostly have the ability to deal with these weird issues (or just use setup.py install --force if that's what they prefer)? This still seems like a reasonable trade-off to me if it also has the effect of reducing the number of weird broken installs among our thousands-of-times-larger userbase. -n -- Nathaniel J. Smith -- http://vorpus.org From njs at pobox.com Mon Nov 2 22:02:30 2015 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 2 Nov 2015 19:02:30 -0800 Subject: [Numpy-discussion] [Distutils] Proposal: stop supporting 'setup.py install'; start requiring 'pip install .' instead In-Reply-To: References: Message-ID: On Nov 2, 2015 6:51 PM, "Robert Collins" wrote: > > On 3 November 2015 at 14:57, Nathaniel Smith wrote: > > [Adding distutils-sig to the CC as a heads-up. 
The context is that > > numpy is looking at deprecating the use of 'python setup.py install' > > and enforcing the use of 'pip install .' instead, and running into > > some issues that will probably need to be addressed if 'pip install .' > > is going to become the standard interface to work with source trees.] > > > > On Sun, Nov 1, 2015 at 3:16 PM, Ralf Gommers wrote: > > [...] > >> Hmm, after some more testing I'm going to have to bring up a few concerns > >> myself: > >> > >> 1. ``pip install .`` still has a clear bug; it starts by copying everything > >> (including .git/ !) to a tempdir with shutil, which is very slow. And the > >> fix for that will go via ``setup.py sdist``, which is still slow. > > > > Ugh. If 'pip (install/wheel) .' is supposed to become the standard way > > to build things, then it should probably build in-place by default. > > Working in a temp dir makes perfect sense for 'pip install > > ' or 'pip install ', but if the user supplies an > > actual named on-disk directory then presumably the user is expecting > > this directory to be used, and to be able to take advantage of > > incremental rebuilds etc., no? > > Thats what 'pip install -e .' does. 'setup.py develop' -> 'pip install -e .' I'm not talking about in place installs, I'm talking about e.g. building a wheel and then tweaking one file and rebuilding -- traditionally build systems go to some effort to keep track of intermediate artifacts and reuse them across builds when possible, but if you always copy the source tree into a temporary directory before building then there's not much the build system can do. > >> 3. ``pip install .`` refuses to upgrade an already installed development > >> version. For released versions that makes sense, but if I'm in a git tree > >> then I don't want it to refuse because 1.11.0.dev0+githash1 compares equal > >> to 1.11.0.dev0+githash2. Especially after waiting a few minutes, see (1). > > > > Ugh, this is clearly just a bug -- `pip install .` should always > > unconditionally install, IMO. (Did you file a bug yet?) At least the > > workaround is just 'pip uninstall numpy; pip install .', which is > > still better the running 'setup.py install' and having it blithely > > overwrite some files and not others. > > There is a bug open. https://github.com/pypa/pip/issues/536 Thanks! -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From lzkelley at gmail.com Tue Nov 3 09:40:31 2015 From: lzkelley at gmail.com (Luke Zoltan Kelley) Date: Tue, 3 Nov 2015 09:40:31 -0500 Subject: [Numpy-discussion] histogram gives meaningless results with non-finite range Message-ID: <33129B70-E5C4-4F7E-A953-E47D4391690E@gmail.com> This came up in [a matplotlib issue](https://github.com/matplotlib/matplotlib/issues/5221): >>> np.histogram(np.arange(10), range=(0.0, np.inf)) (array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0]), array([ nan, inf, inf, inf, inf, inf, inf, inf, inf, inf, inf])) >>> np.histogram(np.arange(10), range=(0.0, np.nan)) (array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0]), array([ nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan])) Clearly the behavior is undefined for those arguments, but perhaps there should be an assertion that the given range must be finite? Happy to make a PR for this. Luke -------------- next part -------------- An HTML attachment was scrubbed... 
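To make the suggestion concrete, here is a minimal sketch of the kind of check being proposed. This is a hypothetical wrapper purely for illustration, not the actual PR, and the name histogram_finite is made up:

import numpy as np

def histogram_finite(a, bins=10, range=None):
    # Hypothetical wrapper: reject non-finite range endpoints up front
    # instead of silently returning all-zero counts and nan/inf bin edges.
    if range is not None:
        lo, hi = range
        if not (np.isfinite(lo) and np.isfinite(hi)):
            raise ValueError("range parameter must be finite, got %r" % (range,))
    return np.histogram(a, bins=bins, range=range)

>>> histogram_finite(np.arange(10), range=(0.0, np.inf))
Traceback (most recent call last):
  ...
ValueError: range parameter must be finite, got (0.0, inf)

With a finite range it just defers to np.histogram unchanged.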
URL: From ben.v.root at gmail.com Tue Nov 3 09:59:59 2015 From: ben.v.root at gmail.com (Benjamin Root) Date: Tue, 3 Nov 2015 09:59:59 -0500 Subject: [Numpy-discussion] deprecate fromstring() for text reading? In-Reply-To: References: <2283704104052164280@unknownmsgid> <-1464708838107245522@unknownmsgid> Message-ID: Correct, there were entries that would sometimes take up their entire width. The delimited text readers could not read this particular dataset. The dataset I am referring to is the processed ISD data: https://www.ncdc.noaa.gov/isd As for fromstring() not being able to help there, I didn't mean to imply that it would. I was more aiming to point out a situation where the NumPy's text file reader was significantly better than the Pandas version, so we would want to make sure that we properly benchmark any significant changes to NumPy's text reading code. Who knows where else NumPy beats Pandas? Ben On Mon, Nov 2, 2015 at 6:44 PM, Chris Barker wrote: > On Tue, Oct 27, 2015 at 7:30 AM, Benjamin Root > wrote: > >> FWIW, when I needed a fast Fixed Width reader >> > > was there potentially no whitespace between fields in that case? In which > case, it really isn a different use-case than delimited text -- if it's at > all common, a version written in C would be nice and fast. and nat hard to > do. > > But fromstring never would have helped you with that anyway :-) > > -CHB > > > >> for a very large dataset last year, I found that np.genfromtext() was >> faster than pandas' read_fwf(). IIRC, pandas' text reading code fell back >> to pure python for fixed width scenarios. >> >> On Fri, Oct 23, 2015 at 8:22 PM, Chris Barker - NOAA Federal < >> chris.barker at noaa.gov> wrote: >> >>> Grabbing the pandas csv reader would be great, and I hope it happens >>> sooner than later, though alas, I haven't the spare cycles for it either. >>> >>> In the meantime though, can we put a deprecation Warning in when using >>> fromstring() on text files? It's really pretty broken. >>> >>> -Chris >>> >>> On Oct 23, 2015, at 4:02 PM, Jeff Reback wrote: >>> >>> >>> >>> On Oct 23, 2015, at 6:49 PM, Nathaniel Smith wrote: >>> >>> On Oct 23, 2015 3:30 PM, "Jeff Reback" wrote: >>> > >>> > On Oct 23, 2015, at 6:13 PM, Charles R Harris < >>> charlesr.harris at gmail.com> wrote: >>> > >>> >> >>> >> >>> >> On Thu, Oct 22, 2015 at 5:47 PM, Chris Barker - NOAA Federal < >>> chris.barker at noaa.gov> wrote: >>> >>> >>> >>> >>> >>>> I think it would be good to keep the usage to read binary data at >>> least. >>> >>> >>> >>> >>> >>> Agreed -- it's only the text file reading I'm proposing to >>> deprecate. It was kind of weird to cram it in there in the first place. >>> >>> >>> >>> Oh, fromfile() has the same issues. >>> >>> >>> >>> Chris >>> >>> >>> >>> >>> >>>> Or is there a good alternative to `np.fromstring(, >>> dtype=...)`? -- Marten >>> >>>> >>> >>>> On Thu, Oct 22, 2015 at 1:03 PM, Chris Barker < >>> chris.barker at noaa.gov> wrote: >>> >>>>> >>> >>>>> There was just a question about a bug/issue with scipy.fromstring >>> (which is numpy.fromstring) when used to read integers from a text file. >>> >>>>> >>> >>>>> >>> https://mail.scipy.org/pipermail/scipy-user/2015-October/036746.html >>> >>>>> >>> >>>>> fromstring() is bugging and inflexible for reading text files -- >>> and it is a very, very ugly mess of code. I dug into it a while back, and >>> gave up -- just to much of a mess! >>> >>>>> >>> >>>>> So we really should completely re-implement it, or deprecate it. 
I >>> doubt anyone is going to do a big refactor, so that means deprecating it. >>> >>>>> >>> >>>>> Also -- if we do want a fast read numbers from text files function >>> (which would be nice, actually), it really should get a new name anyway. >>> >>>>> >>> >>>>> (and the hopefully coming new dtype system would make it easier to >>> write cleanly) >>> >>>>> >>> >>>>> I'm not sure what deprecating something means, though -- have it >>> raise a deprecation warning in the next version? >>> >>>>> >>> >> >>> >> There was discussion at SciPy 2015 of separating out the text reading >>> abilities of Pandas so that numpy could include it. We should contact Jeff >>> Rebeck and see about moving that forward. >>> > >>> > >>> > IIRC Thomas Caswell was interested in doing this :) >>> >>> When he was in Berkeley a few weeks ago he assured me that every night >>> since SciPy he has dutifully been feeling guilty about not having done it >>> yet. I think this week his paltry excuse is that he's "on his honeymoon" or >>> something. >>> >>> ...which is to say that if someone has some spare cycles to take this >>> over then I think that might be a nice wedding present for him :-). >>> >>> (The basic idea is to take the text reading backend behind >>> pandas.read_csv and extract it into a standalone package that pandas could >>> depend on, and that could also be used by other packages like numpy (among >>> others -- I thing dato's SFrame package has a fork of this code as well?)) >>> >>> -n >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> https://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >>> >>> I can certainly provide guidance on how/what to extract but don't have >>> spare cycles myself for this :( >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> https://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> https://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >>> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > > -- > > Christopher Barker, Ph.D. > Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R (206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > Chris.Barker at noaa.gov > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Tue Nov 3 12:03:01 2015 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Tue, 3 Nov 2015 09:03:01 -0800 Subject: [Numpy-discussion] deprecate fromstring() for text reading? In-Reply-To: References: <2283704104052164280@unknownmsgid> <-1464708838107245522@unknownmsgid> Message-ID: <2168677754684929763@unknownmsgid> I was more aiming to point out a situation where the NumPy's text file reader was significantly better than the Pandas version, so we would want to make sure that we properly benchmark any significant changes to NumPy's text reading code. Who knows where else NumPy beats Pandas? Indeed. 
For this example, I think a fixed-with reader really is a different animal, and it's probably a good idea to have a high performance one in Numpy. Among other things, you wouldn't want it to try to auto-determine data types or anything like that. I think what's on the table now is to bring in a new delimited reader -- I.e. CSV in its various flavors. CHB Ben On Mon, Nov 2, 2015 at 6:44 PM, Chris Barker wrote: > On Tue, Oct 27, 2015 at 7:30 AM, Benjamin Root > wrote: > >> FWIW, when I needed a fast Fixed Width reader >> > > was there potentially no whitespace between fields in that case? In which > case, it really isn a different use-case than delimited text -- if it's at > all common, a version written in C would be nice and fast. and nat hard to > do. > > But fromstring never would have helped you with that anyway :-) > > -CHB > > > >> for a very large dataset last year, I found that np.genfromtext() was >> faster than pandas' read_fwf(). IIRC, pandas' text reading code fell back >> to pure python for fixed width scenarios. >> >> On Fri, Oct 23, 2015 at 8:22 PM, Chris Barker - NOAA Federal < >> chris.barker at noaa.gov> wrote: >> >>> Grabbing the pandas csv reader would be great, and I hope it happens >>> sooner than later, though alas, I haven't the spare cycles for it either. >>> >>> In the meantime though, can we put a deprecation Warning in when using >>> fromstring() on text files? It's really pretty broken. >>> >>> -Chris >>> >>> On Oct 23, 2015, at 4:02 PM, Jeff Reback wrote: >>> >>> >>> >>> On Oct 23, 2015, at 6:49 PM, Nathaniel Smith wrote: >>> >>> On Oct 23, 2015 3:30 PM, "Jeff Reback" wrote: >>> > >>> > On Oct 23, 2015, at 6:13 PM, Charles R Harris < >>> charlesr.harris at gmail.com> wrote: >>> > >>> >> >>> >> >>> >> On Thu, Oct 22, 2015 at 5:47 PM, Chris Barker - NOAA Federal < >>> chris.barker at noaa.gov> wrote: >>> >>> >>> >>> >>> >>>> I think it would be good to keep the usage to read binary data at >>> least. >>> >>> >>> >>> >>> >>> Agreed -- it's only the text file reading I'm proposing to >>> deprecate. It was kind of weird to cram it in there in the first place. >>> >>> >>> >>> Oh, fromfile() has the same issues. >>> >>> >>> >>> Chris >>> >>> >>> >>> >>> >>>> Or is there a good alternative to `np.fromstring(, >>> dtype=...)`? -- Marten >>> >>>> >>> >>>> On Thu, Oct 22, 2015 at 1:03 PM, Chris Barker < >>> chris.barker at noaa.gov> wrote: >>> >>>>> >>> >>>>> There was just a question about a bug/issue with scipy.fromstring >>> (which is numpy.fromstring) when used to read integers from a text file. >>> >>>>> >>> >>>>> >>> https://mail.scipy.org/pipermail/scipy-user/2015-October/036746.html >>> >>>>> >>> >>>>> fromstring() is bugging and inflexible for reading text files -- >>> and it is a very, very ugly mess of code. I dug into it a while back, and >>> gave up -- just to much of a mess! >>> >>>>> >>> >>>>> So we really should completely re-implement it, or deprecate it. I >>> doubt anyone is going to do a big refactor, so that means deprecating it. >>> >>>>> >>> >>>>> Also -- if we do want a fast read numbers from text files function >>> (which would be nice, actually), it really should get a new name anyway. >>> >>>>> >>> >>>>> (and the hopefully coming new dtype system would make it easier to >>> write cleanly) >>> >>>>> >>> >>>>> I'm not sure what deprecating something means, though -- have it >>> raise a deprecation warning in the next version? 
>>> >>>>> >>> >> >>> >> There was discussion at SciPy 2015 of separating out the text reading >>> abilities of Pandas so that numpy could include it. We should contact Jeff >>> Rebeck and see about moving that forward. >>> > >>> > >>> > IIRC Thomas Caswell was interested in doing this :) >>> >>> When he was in Berkeley a few weeks ago he assured me that every night >>> since SciPy he has dutifully been feeling guilty about not having done it >>> yet. I think this week his paltry excuse is that he's "on his honeymoon" or >>> something. >>> >>> ...which is to say that if someone has some spare cycles to take this >>> over then I think that might be a nice wedding present for him :-). >>> >>> (The basic idea is to take the text reading backend behind >>> pandas.read_csv and extract it into a standalone package that pandas could >>> depend on, and that could also be used by other packages like numpy (among >>> others -- I thing dato's SFrame package has a fork of this code as well?)) >>> >>> -n >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> https://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >>> >>> I can certainly provide guidance on how/what to extract but don't have >>> spare cycles myself for this :( >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> https://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> https://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >>> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > > -- > > Christopher Barker, Ph.D. > Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R (206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > Chris.Barker at noaa.gov > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Tue Nov 3 12:10:16 2015 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Tue, 3 Nov 2015 09:10:16 -0800 Subject: [Numpy-discussion] [Distutils] Proposal: stop supporting 'setup.py install'; start requiring 'pip install .' instead In-Reply-To: References: Message-ID: <561702487663958680@unknownmsgid> >> I'm not talking about in place installs, I'm talking about e.g. building a >> wheel and then tweaking one file and rebuilding -- traditionally build >> systems go to some effort to keep track of intermediate artifacts and reuse >> them across builds when possible, but if you always copy the source tree >> into a temporary directory before building then there's not much the build >> system can do. This strikes me as an optimization -- is it an important one? If I'm doing a lot of tweaking and re-running, I'm usually in develop mode. 
I can see that when you build a wheel, you may build it, test it, discover an wheel-specific error, and then need to repeat the cycle -- but is that a major use-case? That being said, I have been pretty frustrated debugging conda-build scripts -- there is a lot of overhead setting up the build environment each time you do a build... But with wheel building there is much less overhead, and far fewer complications requiring the edit-build cycle. And couldn't make-style this-has-already-been-done checking happen with a copy anyway? CHB > Ah yes. So I don't think pip should do what it does. It a violation of > the abstractions we all want to see within it. However its not me you > need to convince ;). > > -Rob > > -- > Robert Collins > Distinguished Technologist > HP Converged Cloud > _______________________________________________ > Distutils-SIG maillist - Distutils-SIG at python.org > https://mail.python.org/mailman/listinfo/distutils-sig From charlesr.harris at gmail.com Wed Nov 4 14:28:48 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 4 Nov 2015 12:28:48 -0700 Subject: [Numpy-discussion] New behavior of allclose Message-ID: Hi All, This is to open a discussion of a change of behavior of `np.allclose`. That function uses `isclose` in numpy 1.10 with the result that array subtypes are preserved whereas before they were not. In particular, memmaps are returned when at least one of the inputs is a memmap. By and large I think this is a good thing, OTOH, it is a change in behavior. It is easy to fix, just run `np.array(result, copy=False)` on the current `result`, but I thought I'd raise the topic on the list in case there is a good argument to change things. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From nathan12343 at gmail.com Wed Nov 4 14:36:01 2015 From: nathan12343 at gmail.com (Nathan Goldbaum) Date: Wed, 4 Nov 2015 13:36:01 -0600 Subject: [Numpy-discussion] New behavior of allclose In-Reply-To: References: Message-ID: I actually brought this up before 1.10 came out: https://github.com/numpy/numpy/issues/6196 The behavior change brought out a bug in our use of allclose, so while it was annoying in the sense that our test suite started failing in a new way, it was good in that our tests are now more correct. On Wed, Nov 4, 2015 at 1:28 PM, Charles R Harris wrote: > Hi All, > > This is to open a discussion of a change of behavior of `np.allclose`. > That function uses `isclose` in numpy 1.10 with the result that array > subtypes are preserved whereas before they were not. In particular, memmaps > are returned when at least one of the inputs is a memmap. By and large I > think this is a good thing, OTOH, it is a change in behavior. It is easy to > fix, just run `np.array(result, copy=False)` on the current `result`, but I > thought I'd raise the topic on the list in case there is a good argument to > change things. > > Chuck > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ben.v.root at gmail.com Wed Nov 4 14:40:12 2015 From: ben.v.root at gmail.com (Benjamin Root) Date: Wed, 4 Nov 2015 14:40:12 -0500 Subject: [Numpy-discussion] New behavior of allclose In-Reply-To: References: Message-ID: I am not sure I understand what you mean. Specifically that np.isclose will return a memmap if one of the inputs is a memmap. 
The result is a brand new array, right? So, what is that result memmapping from? Also, how does this impact np.allclose()? That function returns a scalar True/False, so what is the change in behavior there? By the way, the docs for isclose in 1.10.1 does not mention any behavior changes. Ben Root On Wed, Nov 4, 2015 at 2:28 PM, Charles R Harris wrote: > Hi All, > > This is to open a discussion of a change of behavior of `np.allclose`. > That function uses `isclose` in numpy 1.10 with the result that array > subtypes are preserved whereas before they were not. In particular, memmaps > are returned when at least one of the inputs is a memmap. By and large I > think this is a good thing, OTOH, it is a change in behavior. It is easy to > fix, just run `np.array(result, copy=False)` on the current `result`, but I > thought I'd raise the topic on the list in case there is a good argument to > change things. > > Chuck > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From nathan12343 at gmail.com Wed Nov 4 14:42:28 2015 From: nathan12343 at gmail.com (Nathan Goldbaum) Date: Wed, 4 Nov 2015 13:42:28 -0600 Subject: [Numpy-discussion] New behavior of allclose In-Reply-To: References: Message-ID: Oh oops, this is about np.allcose, not np.assert_allclose. Sorry for the noise... On Wed, Nov 4, 2015 at 1:36 PM, Nathan Goldbaum wrote: > I actually brought this up before 1.10 came out: > https://github.com/numpy/numpy/issues/6196 > > The behavior change brought out a bug in our use of allclose, so while it > was annoying in the sense that our test suite started failing in a new way, > it was good in that our tests are now more correct. > > On Wed, Nov 4, 2015 at 1:28 PM, Charles R Harris < > charlesr.harris at gmail.com> wrote: > >> Hi All, >> >> This is to open a discussion of a change of behavior of `np.allclose`. >> That function uses `isclose` in numpy 1.10 with the result that array >> subtypes are preserved whereas before they were not. In particular, memmaps >> are returned when at least one of the inputs is a memmap. By and large I >> think this is a good thing, OTOH, it is a change in behavior. It is easy to >> fix, just run `np.array(result, copy=False)` on the current `result`, but I >> thought I'd raise the topic on the list in case there is a good argument to >> change things. >> >> Chuck >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Wed Nov 4 14:43:48 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 4 Nov 2015 12:43:48 -0700 Subject: [Numpy-discussion] New behavior of allclose In-Reply-To: References: Message-ID: On Wed, Nov 4, 2015 at 12:40 PM, Benjamin Root wrote: > I am not sure I understand what you mean. Specifically that np.isclose > will return a memmap if one of the inputs is a memmap. The result is a > brand new array, right? So, what is that result memmapping from? Also, how > does this impact np.allclose()? That function returns a scalar True/False, > so what is the change in behavior there? > > By the way, the docs for isclose in 1.10.1 does not mention any behavior > changes. 
> Yep, it is a new issue, see #6475 Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Wed Nov 4 14:45:06 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 4 Nov 2015 12:45:06 -0700 Subject: [Numpy-discussion] New behavior of allclose In-Reply-To: References: Message-ID: On Wed, Nov 4, 2015 at 12:42 PM, Nathan Goldbaum wrote: > Oh oops, this is about np.allcose, not np.assert_allclose. Sorry for the > noise... > Probably related ;) Did you open an issue for it? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From nathan12343 at gmail.com Wed Nov 4 14:47:43 2015 From: nathan12343 at gmail.com (Nathan Goldbaum) Date: Wed, 4 Nov 2015 13:47:43 -0600 Subject: [Numpy-discussion] New behavior of allclose In-Reply-To: References: Message-ID: Yup, https://github.com/numpy/numpy/issues/6196 On Wed, Nov 4, 2015 at 1:45 PM, Charles R Harris wrote: > > > On Wed, Nov 4, 2015 at 12:42 PM, Nathan Goldbaum > wrote: > >> Oh oops, this is about np.allcose, not np.assert_allclose. Sorry for the >> noise... >> > > Probably related ;) Did you open an issue for it? > > Chuck > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From derek at astro.physik.uni-goettingen.de Wed Nov 4 15:00:19 2015 From: derek at astro.physik.uni-goettingen.de (Derek Homeier) Date: Wed, 4 Nov 2015 21:00:19 +0100 Subject: [Numpy-discussion] deprecate fromstring() for text reading? In-Reply-To: <2168677754684929763@unknownmsgid> References: <2283704104052164280@unknownmsgid> <-1464708838107245522@unknownmsgid> <2168677754684929763@unknownmsgid> Message-ID: <58E03A91-E801-4A53-AC4D-538C9A7DBBC1@astro.physik.uni-goettingen.de> On 3 Nov 2015, at 6:03 pm, Chris Barker - NOAA Federal wrote: > > I was more aiming to point out a situation where the NumPy's text file reader was significantly better than the Pandas version, so we would want to make sure that we properly benchmark any significant changes to NumPy's text reading code. Who knows where else NumPy beats Pandas? > Indeed. For this example, I think a fixed-with reader really is a different animal, and it's probably a good idea to have a high performance one in Numpy. Among other things, you wouldn't want it to try to auto-determine data types or anything like that. > > I think what's on the table now is to bring in a new delimited reader -- I.e. CSV in its various flavors. > To add my own handful of change or at least another data point, I had been looking into both the pandas and the Astropy fast readers as a fast loadtxt/genfromtxt replacement; at the time I found the Astropy cparser source somewhat easier to dig into, although looking now Pandas' parser.pyx seems clear enough as well. Some comparison of the two can be found at http://astropy.readthedocs.org/en/stable/io/ascii/fast_ascii_io.html#speed-gains Unfortunately the Astropy fast reader currently does not support fixed-width format either, and adding this functionality would require modifications to the tokenizer C code - not sure how extensive. 
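For what it's worth, the pure-Python np.genfromtxt can already read
fixed-width fields today if you pass the field widths as `delimiter`; it is
slow, but it shows roughly what a fast fixed-width mode would need to cover
(the widths and values below are made up for illustration):

    import io
    import numpy as np

    # two fixed-width fields: 3 characters, then 6 characters
    data = io.BytesIO(b"  1  2.50\n 10 13.75\n")
    arr = np.genfromtxt(data, delimiter=[3, 6])
    # arr -> array([[  1.  ,   2.5 ],
    #               [ 10.  ,  13.75]])
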
Cheers, Derek From stefan at seefeld.name Wed Nov 4 19:40:11 2015 From: stefan at seefeld.name (Stefan Seefeld) Date: Wed, 4 Nov 2015 19:40:11 -0500 Subject: [Numpy-discussion] querying backend information Message-ID: <563AA56B.5020207@seefeld.name> Hello, is there a way to query Numpy for information about backends (BLAS, LAPACK, etc.) that it was compiled against, including compiler / linker flags that were used ? Consider the use-case where instead of calling a function such as numpy.dot() I may want to call the appropriate backend directly using the C API as an optimization technique. Is there a straight-forward way to do that ? In a somewhat related line of thought: Is there a way to see what backends are available during Numpy compile-time ? I'm looking for a list of flags to pick ATLAS/OpenBLAS/LAPACK/MKL or any other backend that might be available, combined with variables (compiler and linker flags, notably) I might have to set. Is that information available at all ? Thanks, Stefan -- ...ich hab' noch einen Koffer in Berlin... From njs at pobox.com Wed Nov 4 23:11:38 2015 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 4 Nov 2015 20:11:38 -0800 Subject: [Numpy-discussion] querying backend information In-Reply-To: <563AA56B.5020207@seefeld.name> References: <563AA56B.5020207@seefeld.name> Message-ID: On Wed, Nov 4, 2015 at 4:40 PM, Stefan Seefeld wrote: > Hello, > > is there a way to query Numpy for information about backends (BLAS, > LAPACK, etc.) that it was compiled against, including compiler / linker > flags that were used ? > Consider the use-case where instead of calling a function such as > numpy.dot() I may want to call the appropriate backend directly using > the C API as an optimization technique. Is there a straight-forward way > to do that ? > > In a somewhat related line of thought: Is there a way to see what > backends are available during Numpy compile-time ? I'm looking for a > list of flags to pick ATLAS/OpenBLAS/LAPACK/MKL or any other backend > that might be available, combined with variables (compiler and linker > flags, notably) I might have to set. Is that information available at all ? NumPy does reveal some information about its configuration and numpy.distutils does provide helper methods, but I'm not super familiar with it so I'll let others answer that part. Regarding the idea of "cutting out the middleman" and calling directly into the appropriate backend via the C API, NumPy doesn't currently expose any interface for doing this. There are some discussions with Antoine from a few months back about this (and given that you work at the same place I'm guessing the motivation is the same? :-)). For some reason I'm failing to find the archives now, but the summary from off the top of my head is: SciPy does expose an interface for this (via cython and its PyCapsule tricks -- see [1]), NumPy is unlikely to because we're wary of adding extra public interfaces and can't guarantee that we even have a full BLAS/LAPACK available (sometimes we fall back on a minimal vendored subset that's just enough for our needs), you probably don't want to try and get into the business of dynamically hunting down BLAS/LAPACK because it will be brittle and expose you to all kinds of cross-platform linker issues, and if you want to pull the clever stuff that scipy is doing out of scipy and put it into its own dedicated blas/lapack package, then well, we need one of those anyway [2]. 
-n [1] https://github.com/scipy-conference/scipy_proceedings_2015/blob/master/papers/ian_henriksen/cython_blas_lapack_api.rst [2] e.g. https://mail.scipy.org/pipermail/numpy-discussion/2015-January/072123.html -- Nathaniel J. Smith -- http://vorpus.org From ralf.gommers at gmail.com Thu Nov 5 01:37:41 2015 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Thu, 5 Nov 2015 07:37:41 +0100 Subject: [Numpy-discussion] querying backend information In-Reply-To: References: <563AA56B.5020207@seefeld.name> Message-ID: On Thu, Nov 5, 2015 at 5:11 AM, Nathaniel Smith wrote: > On Wed, Nov 4, 2015 at 4:40 PM, Stefan Seefeld > wrote: > > Hello, > > > > is there a way to query Numpy for information about backends (BLAS, > > LAPACK, etc.) that it was compiled against, including compiler / linker > > flags that were used ? > > Consider the use-case where instead of calling a function such as > > numpy.dot() I may want to call the appropriate backend directly using > > the C API as an optimization technique. Is there a straight-forward way > > to do that ? > > > > In a somewhat related line of thought: Is there a way to see what > > backends are available during Numpy compile-time ? I'm looking for a > > list of flags to pick ATLAS/OpenBLAS/LAPACK/MKL or any other backend > > that might be available, combined with variables (compiler and linker > > flags, notably) I might have to set. Is that information available at > all ? > > NumPy does reveal some information about its configuration and > numpy.distutils does provide helper methods, but I'm not super > familiar with it so I'll let others answer that part. > np.show_config() Gives: lapack_opt_info: libraries = ['lapack', 'f77blas', 'cblas', 'atlas'] library_dirs = ['/usr/lib/atlas-base/atlas', '/usr/lib/atlas-base'] define_macros = [('NO_ATLAS_INFO', -1)] language = f77 include_dirs = ['/usr/include/atlas'] openblas_lapack_info: NOT AVAILABLE .... It's a function with no docstring and not in the html docs (I think), so that can certainly be improved. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From shoyer at gmail.com Thu Nov 5 02:42:15 2015 From: shoyer at gmail.com (Stephan Hoyer) Date: Wed, 4 Nov 2015 23:42:15 -0800 Subject: [Numpy-discussion] Proposal for a new function: np.moveaxis Message-ID: I've put up a pull request implementing a new function, np.moveaxis, as an alternative to np.transpose and np.rollaxis: https://github.com/numpy/numpy/pull/6630 This functionality has been discussed (even the exact function name) several times over the years, but it never made it into a pull request. The most pressing issue is that the behavior of np.rollaxis is not intuitive to most users: https://mail.scipy.org/pipermail/numpy-discussion/2010-September/052882.html https://github.com/numpy/numpy/issues/2039 http://stackoverflow.com/questions/29891583/reason-why-numpy-rollaxis-is-so-confusing In this pull request, I also allow the source and destination axes to be sequences as well as scalars. This does not add much complexity to the code, solves some additional use cases and makes np.moveaxis a proper generalization of the other axes manipulation routines (see the pull requests for details). Best of all, it already works on ndarray duck types (like masked array and dask.array), because they have already implemented transpose. I think np.moveaxis would be a useful addition to NumPy -- I've found myself writing helper functions with a subset of its functionality several times over the past few years. 
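To make the proposed semantics concrete, a minimal example of what the pull
request implements (shapes only):

    import numpy as np

    x = np.zeros((3, 4, 5))

    # move axis 0 to the last position
    np.moveaxis(x, 0, -1).shape    # -> (4, 5, 3)

    # the rollaxis spelling of the same operation needs the non-obvious
    # "start" argument
    np.rollaxis(x, 0, 3).shape     # -> (4, 5, 3)
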
What do you think? Cheers, Stephan -------------- next part -------------- An HTML attachment was scrubbed... URL: From jni.soma at gmail.com Thu Nov 5 03:26:24 2015 From: jni.soma at gmail.com (Juan Nunez-Iglesias) Date: Thu, 05 Nov 2015 00:26:24 -0800 (PST) Subject: [Numpy-discussion] Proposal for a new function: np.moveaxis In-Reply-To: References: Message-ID: <1446711984088.9f2d3d97@Nodemailer> I'm just a lowly user, but I'm a fan of this. +1! On Thu, Nov 5, 2015 at 6:42 PM, Stephan Hoyer wrote: > I've put up a pull request implementing a new function, np.moveaxis, as an > alternative to np.transpose and np.rollaxis: > https://github.com/numpy/numpy/pull/6630 > This functionality has been discussed (even the exact function name) > several times over the years, but it never made it into a pull request. The > most pressing issue is that the behavior of np.rollaxis is not intuitive to > most users: > https://mail.scipy.org/pipermail/numpy-discussion/2010-September/052882.html > https://github.com/numpy/numpy/issues/2039 > http://stackoverflow.com/questions/29891583/reason-why-numpy-rollaxis-is-so-confusing > In this pull request, I also allow the source and destination axes to be > sequences as well as scalars. This does not add much complexity to the > code, solves some additional use cases and makes np.moveaxis a proper > generalization of the other axes manipulation routines (see the pull > requests for details). > Best of all, it already works on ndarray duck types (like masked array and > dask.array), because they have already implemented transpose. > I think np.moveaxis would be a useful addition to NumPy -- I've found > myself writing helper functions with a subset of its functionality several > times over the past few years. What do you think? > Cheers, > Stephan -------------- next part -------------- An HTML attachment was scrubbed... URL: From ewm at redtetrahedron.org Thu Nov 5 08:12:36 2015 From: ewm at redtetrahedron.org (Eric Moore) Date: Thu, 5 Nov 2015 08:12:36 -0500 Subject: [Numpy-discussion] querying backend information In-Reply-To: References: <563AA56B.5020207@seefeld.name> Message-ID: On Thu, Nov 5, 2015 at 1:37 AM, Ralf Gommers wrote: > > > On Thu, Nov 5, 2015 at 5:11 AM, Nathaniel Smith wrote: > >> On Wed, Nov 4, 2015 at 4:40 PM, Stefan Seefeld >> wrote: >> > Hello, >> > >> > is there a way to query Numpy for information about backends (BLAS, >> > LAPACK, etc.) that it was compiled against, including compiler / linker >> > flags that were used ? >> > Consider the use-case where instead of calling a function such as >> > numpy.dot() I may want to call the appropriate backend directly using >> > the C API as an optimization technique. Is there a straight-forward way >> > to do that ? >> > >> > In a somewhat related line of thought: Is there a way to see what >> > backends are available during Numpy compile-time ? I'm looking for a >> > list of flags to pick ATLAS/OpenBLAS/LAPACK/MKL or any other backend >> > that might be available, combined with variables (compiler and linker >> > flags, notably) I might have to set. Is that information available at >> all ? >> >> NumPy does reveal some information about its configuration and >> numpy.distutils does provide helper methods, but I'm not super >> familiar with it so I'll let others answer that part. 
>> > > np.show_config() > > Gives: > > lapack_opt_info: > libraries = ['lapack', 'f77blas', 'cblas', 'atlas'] > library_dirs = ['/usr/lib/atlas-base/atlas', '/usr/lib/atlas-base'] > define_macros = [('NO_ATLAS_INFO', -1)] > language = f77 > include_dirs = ['/usr/include/atlas'] > openblas_lapack_info: > NOT AVAILABLE > .... > > > It's a function with no docstring and not in the html docs (I think), so > that can certainly be improved. > > Ralf > I don't think that show_config is what you want. Those are built time values that aren't necessarily true at run time. For instance, numpy from conda references directories that are not on my machine. Eric -------------- next part -------------- An HTML attachment was scrubbed... URL: From jjhelmus at gmail.com Thu Nov 5 10:18:23 2015 From: jjhelmus at gmail.com (Jonathan Helmus) Date: Thu, 5 Nov 2015 09:18:23 -0600 Subject: [Numpy-discussion] Proposal for a new function: np.moveaxis In-Reply-To: <1446711984088.9f2d3d97@Nodemailer> References: <1446711984088.9f2d3d97@Nodemailer> Message-ID: <563B733F.1090903@gmail.com> Also a +1 from me. I've had to (re-)learn how exactly np.transpose works more times then I care to admit. - Jonathan Helmus On 11/05/2015 02:26 AM, Juan Nunez-Iglesias wrote: > I'm just a lowly user, but I'm a fan of this. +1! > > > > > On Thu, Nov 5, 2015 at 6:42 PM, Stephan Hoyer > wrote: > > I've put up a pull request implementing a new function, > np.moveaxis, as an alternative to np.transpose and np.rollaxis: > https://github.com/numpy/numpy/pull/6630 > > This functionality has been discussed (even the exact function > name) several times over the years, but it never made it into a > pull request. The most pressing issue is that the behavior of > np.rollaxis is not intuitive to most users: > https://mail.scipy.org/pipermail/numpy-discussion/2010-September/052882.html > https://github.com/numpy/numpy/issues/2039 > http://stackoverflow.com/questions/29891583/reason-why-numpy-rollaxis-is-so-confusing > > In this pull request, I also allow the source and destination axes > to be sequences as well as scalars. This does not add much > complexity to the code, solves some additional use cases and makes > np.moveaxis a proper generalization of the other axes manipulation > routines (see the pull requests for details). > > Best of all, it already works on ndarray duck types (like masked > array and dask.array), because they have already implemented > transpose. > > I think np.moveaxis would be a useful addition to NumPy -- I've > found myself writing helper functions with a subset of its > functionality several times over the past few years. What do you > think? > > Cheers, > Stephan > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From hbar1054571 at gmail.com Thu Nov 5 11:26:18 2015 From: hbar1054571 at gmail.com (Johan) Date: Thu, 5 Nov 2015 16:26:18 +0000 (UTC) Subject: [Numpy-discussion] =?utf-8?q?Compilation_problems_npy=5Ffloat64?= Message-ID: Hello, I searched the forum, but couldn't find a post related to my problem. 
I am installing scipy via pip in cygwin environment pip install scipy Note: numpy version 1.10.1 was installed with pip install -U numpy /usr/bin/gfortran -Wall -g -Wall -g -shared -Wl,-gc-sections -Wl,-s build/temp.cygwin-2.2.1-x86_64-2.7/scipy/spatial/qhull.o build/temp.cygwin-2.2.1-x86_64-2.7/scipy/spatial/qhull/src/geom2.o build/temp.cygwin-2.2.1-x86_64-2.7/scipy/spatial/qhull/src/geom.o build/temp.cygwin-2.2.1-x86_64-2.7/scipy/spatial/qhull/src/global.o build/temp.cygwin-2.2.1-x86_64-2.7/scipy/spatial/qhull/src/io.o build/temp.cygwin-2.2.1-x86_64-2.7/scipy/spatial/qhull/src/libqhull.o build/temp.cygwin-2.2.1-x86_64-2.7/scipy/spatial/qhull/src/mem.o build/temp.cygwin-2.2.1-x86_64-2.7/scipy/spatial/qhull/src/merge.o build/temp.cygwin-2.2.1-x86_64-2.7/scipy/spatial/qhull/src/poly2.o build/temp.cygwin-2.2.1-x86_64-2.7/scipy/spatial/qhull/src/poly.o build/temp.cygwin-2.2.1-x86_64-2.7/scipy/spatial/qhull/src/qset.o build/temp.cygwin-2.2.1-x86_64-2.7/scipy/spatial/qhull/src/random.o build/temp.cygwin-2.2.1-x86_64-2.7/scipy/spatial/qhull/src/rboxlib.o build/temp.cygwin-2.2.1-x86_64-2.7/scipy/spatial/qhull/src/stat.o build/temp.cygwin-2.2.1-x86_64-2.7/scipy/spatial/qhull/src/user.o build/temp.cygwin-2.2.1-x86_64-2.7/scipy/spatial/qhull/src/usermem.o build/temp.cygwin-2.2.1-x86_64-2.7/scipy/spatial/qhull/src/userprintf.o build/temp.cygwin-2.2.1-x86_64- 2.7/scipy/spatial/qhull/src/userprintf_rbox.o -L/usr/lib - L/usr/lib/gcc/x86_64-pc-cygwin/4.9.3 -L/usr/lib/python2.7/config - L/usr/lib -Lbuild/temp.cygwin-2.2.1-x86_64-2.7 -llapack -lblas - lpython2.7 -lgfortran -o build/lib.cygwin-2.2.1-x86_64- 2.7/scipy/spatial/qhull.dll building 'scipy.spatial.ckdtree' extension compiling C++ sources C compiler: g++ -fno-strict-aliasing -ggdb -O2 -pipe -Wimplicit- function-declaration -fdebug-prefix-map=/usr/src/ports/python/python- 2.7.10-1.x86_64/build=/usr/src/debug/python-2.7.10-1 -fdebug-prefix- map=/usr/src/ports/python/python-2.7.10-1.x86_64/src/Python- 2.7.10=/usr/src/debug/python-2.7.10-1 -DNDEBUG -g -fwrapv -O3 -Wall creating build/temp.cygwin-2.2.1-x86_64-2.7/scipy/spatial/ckdtree creating build/temp.cygwin-2.2.1-x86_64- 2.7/scipy/spatial/ckdtree/src compile options: '-I/usr/include/python2.7 - I/usr/lib/python2.7/site-packages/numpy/core/include - Iscipy/spatial/ckdtree/src -I/usr/lib/python2.7/site- packages/numpy/core/include -I/usr/include/python2.7 -c' g++: scipy/spatial/ckdtree/src/ckdtree_cpp_exc.cxx cc1plus: warning: command line option ?-Wimplicit-function- declaration? is valid for C/ObjC but not for C++ g++: scipy/spatial/ckdtree/src/ckdtree_query.cxx cc1plus: warning: command line option ?-Wimplicit-function- declaration? is valid for C/ObjC but not for C++ In file included from /usr/lib/python2.7/site- packages/numpy/core/include/numpy/ndarraytypes.h:1781:0, from /usr/lib/python2.7/site- packages/numpy/core/include/numpy/ndarrayobject.h:18, from /usr/lib/python2.7/site- packages/numpy/core/include/numpy/arrayobject.h:4, from scipy/spatial/ckdtree/src/ckdtree_query.cxx:15: /usr/lib/python2.7/site- packages/numpy/core/include/numpy/npy_1_7_deprecated_api.h:15:2: warning: #warning "Using deprecated NumPy API, disable it by " "#defining NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION" [-Wcpp] #warning "Using deprecated NumPy API, disable it by " \ ^ In file included from scipy/spatial/ckdtree/src/ckdtree_query.cxx:31:0: scipy/spatial/ckdtree/src/ckdtree_cpp_methods.h:12:20: error: ?npy_float64 infinity? 
redeclared as different kind of symbol extern npy_float64 infinity; ^ In file included from /usr/include/python2.7/pyport.h:325:0, from /usr/include/python2.7/Python.h:58, from scipy/spatial/ckdtree/src/ckdtree_query.cxx:14: /usr/include/math.h:263:15: note: previous declaration ?double infinity()? extern double infinity _PARAMS((void)); ^ In file included from scipy/spatial/ckdtree/src/ckdtree_query.cxx:31:0: scipy/spatial/ckdtree/src/ckdtree_cpp_methods.h: In function ?npy_float64 _distance_p(const npy_float64*, const npy_float64*, npy_float64, npy_intp, npy_float64)?: scipy/spatial/ckdtree/src/ckdtree_cpp_methods.h:139:17: error: invalid operands of types ?const npy_float64 {aka const double}? and ?double()? to binary ?operator==? else if (p==infinity) { ^ scipy/spatial/ckdtree/src/ckdtree_query.cxx: In function ?PyObject* query_knn(const ckdtree*, npy_float64*, npy_intp*, const npy_float64*, npy_intp, npy_intp, npy_float64, npy_float64, npy_float64)?: scipy/spatial/ckdtree/src/ckdtree_query.cxx:431:111: error: cannot convert ?double (*)()? to ?npy_float64 {aka double}? for argument ?9? to ?void __query_single_point(const ckdtree*, npy_float64*, npy_intp*, const npy_float64*, npy_intp, npy_float64, npy_float64, npy_float64, npy_float64)? __query_single_point(self, dd_row, ii_row, xx_row, k, eps, p, distance_upper_bound, ::infinity); ^ In file included from /usr/lib/python2.7/site- packages/numpy/core/include/numpy/ndarrayobject.h:27:0, from /usr/lib/python2.7/site- packages/numpy/core/include/numpy/arrayobject.h:4, from scipy/spatial/ckdtree/src/ckdtree_query.cxx:15: /usr/lib/python2.7/site- packages/numpy/core/include/numpy/__multiarray_api.h: At global scope: /usr/lib/python2.7/site- packages/numpy/core/include/numpy/__multiarray_api.h:1634:1: warning: ?int _import_array()? defined but not used [-Wunused-function] _import_array(void) ^ cc1plus: warning: command line option ?-Wimplicit-function- declaration? is valid for C/ObjC but not for C++ In file included from /usr/lib/python2.7/site- packages/numpy/core/include/numpy/ndarraytypes.h:1781:0, from /usr/lib/python2.7/site- packages/numpy/core/include/numpy/ndarrayobject.h:18, from /usr/lib/python2.7/site- packages/numpy/core/include/numpy/arrayobject.h:4, from scipy/spatial/ckdtree/src/ckdtree_query.cxx:15: /usr/lib/python2.7/site- packages/numpy/core/include/numpy/npy_1_7_deprecated_api.h:15:2: warning: #warning "Using deprecated NumPy API, disable it by " "#defining NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION" [-Wcpp] #warning "Using deprecated NumPy API, disable it by " \ ^ In file included from scipy/spatial/ckdtree/src/ckdtree_query.cxx:31:0: scipy/spatial/ckdtree/src/ckdtree_cpp_methods.h:12:20: error: ?npy_float64 infinity? redeclared as different kind of symbol extern npy_float64 infinity; ^ In file included from /usr/include/python2.7/pyport.h:325:0, from /usr/include/python2.7/Python.h:58, from scipy/spatial/ckdtree/src/ckdtree_query.cxx:14: /usr/include/math.h:263:15: note: previous declaration ?double infinity()? extern double infinity _PARAMS((void)); ^ In file included from scipy/spatial/ckdtree/src/ckdtree_query.cxx:31:0: scipy/spatial/ckdtree/src/ckdtree_cpp_methods.h: In function ?npy_float64 _distance_p(const npy_float64*, const npy_float64*, npy_float64, npy_intp, npy_float64)?: scipy/spatial/ckdtree/src/ckdtree_cpp_methods.h:139:17: error: invalid operands of types ?const npy_float64 {aka const double}? and ?double()? to binary ?operator==? 
else if (p==infinity) { ^ scipy/spatial/ckdtree/src/ckdtree_query.cxx: In function ?PyObject* query_knn(const ckdtree*, npy_float64*, npy_intp*, const npy_float64*, npy_intp, npy_intp, npy_float64, npy_float64, npy_float64)?: scipy/spatial/ckdtree/src/ckdtree_query.cxx:431:111: error: cannot convert ?double (*)()? to ?npy_float64 {aka double}? for argument ?9? to ?void __query_single_point(const ckdtree*, npy_float64*, npy_intp*, const npy_float64*, npy_intp, npy_float64, npy_float64, npy_float64, npy_float64)? __query_single_point(self, dd_row, ii_row, xx_row, k, eps, p, distance_upper_bound, ::infinity); ^ In file included from /usr/lib/python2.7/site- packages/numpy/core/include/numpy/ndarrayobject.h:27:0, from /usr/lib/python2.7/site- packages/numpy/core/include/numpy/arrayobject.h:4, from scipy/spatial/ckdtree/src/ckdtree_query.cxx:15: /usr/lib/python2.7/site- packages/numpy/core/include/numpy/__multiarray_api.h: At global scope: /usr/lib/python2.7/site- packages/numpy/core/include/numpy/__multiarray_api.h:1634:1: warning: ?int _import_array()? defined but not used [-Wunused-function] _import_array(void) ^ error: Command "g++ -fno-strict-aliasing -ggdb -O2 -pipe -Wimplicit- function-declaration -fdebug-prefix-map=/usr/src/ports/python/python- 2.7.10-1.x86_64/build=/usr/src/debug/python-2.7.10-1 -fdebug-prefix- map=/usr/src/ports/python/python-2.7.10-1.x86_64/src/Python- 2.7.10=/usr/src/debug/python-2.7.10-1 -DNDEBUG -g -fwrapv -O3 -Wall - I/usr/include/python2.7 -I/usr/lib/python2.7/site- packages/numpy/core/include -Iscipy/spatial/ckdtree/src - I/usr/lib/python2.7/site-packages/numpy/core/include - I/usr/include/python2.7 -c scipy/spatial/ckdtree/src/ckdtree_query.cxx - o build/temp.cygwin-2.2.1-x86_64- 2.7/scipy/spatial/ckdtree/src/ckdtree_query.o" failed with exit status 1 ---------------------------------------- Command "/usr/bin/python -c "import setuptools, tokenize;__file__='/tmp/pip-build- vAliRx/scipy/setup.py';exec(compile(getattr(tokenize, 'open', open) (__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" install -- record /tmp/pip-gxrCbK-record/install-record.txt --single-version- externally-managed --compile" failed with error code 1 in /tmp/pip- build-vAliRx/scipy From pav at iki.fi Thu Nov 5 15:07:51 2015 From: pav at iki.fi (Pauli Virtanen) Date: Thu, 5 Nov 2015 20:07:51 +0000 (UTC) Subject: [Numpy-discussion] Compilation problems npy_float64 References: Message-ID: Thu, 05 Nov 2015 16:26:18 +0000, Johan kirjoitti: > Hello, I searched the forum, but couldn't find a post related to my > problem. I am installing scipy via pip in cygwin environment [clip] > /usr/include/math.h:263:15: note: previous declaration ?double > infinity()? > extern double infinity _PARAMS((void)); > ^ [clip] This looks like some Cygwin weirdness --- a variable called "infinity" is apparently there declared by math.h, and thus a reserved name. This was fixed by (but not for this reason) https://github.com/scipy/scipy/commit/832baa20f0b5 so you may have better luck with the dev version. -- Pauli Virtanen From ralf.gommers at gmail.com Thu Nov 5 16:50:49 2015 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Thu, 5 Nov 2015 22:50:49 +0100 Subject: [Numpy-discussion] New behavior of allclose In-Reply-To: References: Message-ID: On Wed, Nov 4, 2015 at 8:28 PM, Charles R Harris wrote: > Hi All, > > This is to open a discussion of a change of behavior of `np.allclose`. 
> That function uses `isclose` in numpy 1.10 with the result that array > subtypes are preserved whereas before they were not. In particular, memmaps > are returned when at least one of the inputs is a memmap. By and large I > think this is a good thing, OTOH, it is a change in behavior. It is easy to > fix, just run `np.array(result, copy=False)` on the current `result`, but I > thought I'd raise the topic on the list in case there is a good argument to > change things. > Why would it be good to return a memmap? And am I confused or does your just merged PR [1] revert the behavior you say here is a good thing? Ralf [1] https://github.com/numpy/numpy/pull/6628 -------------- next part -------------- An HTML attachment was scrubbed... URL: From ben.v.root at gmail.com Thu Nov 5 17:00:34 2015 From: ben.v.root at gmail.com (Benjamin Root) Date: Thu, 5 Nov 2015 17:00:34 -0500 Subject: [Numpy-discussion] New behavior of allclose In-Reply-To: References: Message-ID: allclose() needs to return a bool so that one can do "if np.allclose(foo, bar) is True" or some such. The "good behavior" is for np.isclose() to return a memmap, which still confuses the heck out of me, but I am not a memmap expert. On Thu, Nov 5, 2015 at 4:50 PM, Ralf Gommers wrote: > > > On Wed, Nov 4, 2015 at 8:28 PM, Charles R Harris < > charlesr.harris at gmail.com> wrote: > >> Hi All, >> >> This is to open a discussion of a change of behavior of `np.allclose`. >> That function uses `isclose` in numpy 1.10 with the result that array >> subtypes are preserved whereas before they were not. In particular, memmaps >> are returned when at least one of the inputs is a memmap. By and large I >> think this is a good thing, OTOH, it is a change in behavior. It is easy to >> fix, just run `np.array(result, copy=False)` on the current `result`, but I >> thought I'd raise the topic on the list in case there is a good argument to >> change things. >> > > Why would it be good to return a memmap? And am I confused or does your > just merged PR [1] revert the behavior you say here is a good thing? > > Ralf > > [1] https://github.com/numpy/numpy/pull/6628 > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Thu Nov 5 17:15:26 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 5 Nov 2015 15:15:26 -0700 Subject: [Numpy-discussion] New behavior of allclose In-Reply-To: References: Message-ID: On Thu, Nov 5, 2015 at 2:50 PM, Ralf Gommers wrote: > > > On Wed, Nov 4, 2015 at 8:28 PM, Charles R Harris < > charlesr.harris at gmail.com> wrote: > >> Hi All, >> >> This is to open a discussion of a change of behavior of `np.allclose`. >> That function uses `isclose` in numpy 1.10 with the result that array >> subtypes are preserved whereas before they were not. In particular, memmaps >> are returned when at least one of the inputs is a memmap. By and large I >> think this is a good thing, OTOH, it is a change in behavior. It is easy to >> fix, just run `np.array(result, copy=False)` on the current `result`, but I >> thought I'd raise the topic on the list in case there is a good argument to >> change things. >> > > Why would it be good to return a memmap? And am I confused or does your > just merged PR [1] revert the behavior you say here is a good thing? > Good thing for isclose, not allclose. 
I was thinking of very large files that might exceed memory in the isclose case, but an argument could be made for other subtypes. Allclose, OTOH, always returns a scalar. I went ahead with boolean for allclose because 1) it is backward compatible, 2) Nathaniel tended in that direction, 3) the conversation here is tending in that direction, 4) I tend in that direction, and finally, I want to get 1.10.2rc1 out this weekend ;) Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefanv at berkeley.edu Fri Nov 6 12:32:53 2015 From: stefanv at berkeley.edu (Stefan van der Walt) Date: Fri, 06 Nov 2015 09:32:53 -0800 Subject: [Numpy-discussion] Help wanted: implementation of 3D medial axis skeletonization References: <87vb9khwbd.fsf@berkeley.edu> Message-ID: <87oaf7f2yi.fsf@berkeley.edu> Hi all, I have been approached by a group that is interested in sponsoring the development of 3D skeletonization in scikit-image. One potential starting place would be: http://www.insight-journal.org/browse/publication/181 Is anyone interested in working on this? Please get in touch either on the scikit-image mailing list or by mailing me directly. Thanks! St?fan From njs at pobox.com Fri Nov 6 16:56:45 2015 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 6 Nov 2015 13:56:45 -0800 Subject: [Numpy-discussion] Proposal: stop supporting 'setup.py install'; start requiring 'pip install .' instead In-Reply-To: References: Message-ID: On Mon, Nov 2, 2015 at 5:57 PM, Nathaniel Smith wrote: > On Sun, Nov 1, 2015 at 3:16 PM, Ralf Gommers wrote: >> 2. ``pip install .`` silences build output, which may make sense for some >> usecases, but for numpy it just sits there for minutes with no output after >> printing "Running setup.py install for numpy". Users will think it hangs and >> Ctrl-C it. https://github.com/pypa/pip/issues/2732 > > I tend to agree with the commentary there that for end users this is > different but no worse than the current situation where we spit out > pages of "errors" that don't mean anything :-). I posted a suggestion > on that bug that might help with the apparent hanging problem. For the record, this is now fixed in pip's "develop" branch and should be in the next release. For commands like 'setup.py install', pip now displays a spinner that ticks over whenever the underlying process prints to stdout/stderr. So if the underlying process hangs, then the spinner will stop (it's not just lying to you), but normally it works nicely. https://github.com/pypa/pip/pull/3224 -n -- Nathaniel J. Smith -- http://vorpus.org From pythondev1 at aerojockey.com Sat Nov 7 16:18:22 2015 From: pythondev1 at aerojockey.com (aerojockey) Date: Sat, 7 Nov 2015 14:18:22 -0700 (MST) Subject: [Numpy-discussion] Question about structure arrays Message-ID: <1446931102879-41653.post@n7.nabble.com> Hello, Recently I made some changes to a program I'm working on, and found that the changes made it four times slower than before. After some digging, I found out that one of the new costs was that I added structure arrays. Inside a low-level loop, I create a structure array, populate it Python, then turn it over to some handwritten C code for processing. It turned out that, when passed a structure array as a dtype, numpy has to parse the dtype, which included calls to re.match and eval. Now, this is not a big deal for me to work around by using ordinary slicing and such, and also I can improve things by reusing arrays. 
Since this is inner loop stuff, sacrificing readability for speed is an appropriate tradeoff. Nevertheless, I was curious if there was a way (or any plans for there to be a way) to compile a struture array dtype. I realize it's not the bread-and-butter of numpy, but it turned out to be a very convenient feature for my use case (populating an array of structures to pass off to C). Thanks -- View this message in context: http://numpy-discussion.10968.n7.nabble.com/Question-about-structure-arrays-tp41653.html Sent from the Numpy-discussion mailing list archive at Nabble.com. From njs at pobox.com Sat Nov 7 18:49:22 2015 From: njs at pobox.com (Nathaniel Smith) Date: Sat, 7 Nov 2015 15:49:22 -0800 Subject: [Numpy-discussion] Question about structure arrays In-Reply-To: <1446931102879-41653.post@n7.nabble.com> References: <1446931102879-41653.post@n7.nabble.com> Message-ID: On Sat, Nov 7, 2015 at 1:18 PM, aerojockey wrote: > Hello, > > Recently I made some changes to a program I'm working on, and found that the > changes made it four times slower than before. After some digging, I found > out that one of the new costs was that I added structure arrays. Inside a > low-level loop, I create a structure array, populate it Python, then turn it > over to some handwritten C code for processing. It turned out that, when > passed a structure array as a dtype, numpy has to parse the dtype, which > included calls to re.match and eval. > > Now, this is not a big deal for me to work around by using ordinary slicing > and such, and also I can improve things by reusing arrays. Since this is > inner loop stuff, sacrificing readability for speed is an appropriate > tradeoff. > > Nevertheless, I was curious if there was a way (or any plans for there to be > a way) to compile a struture array dtype. I realize it's not the > bread-and-butter of numpy, but it turned out to be a very convenient feature > for my use case (populating an array of structures to pass off to C). Does it help to turn your dtype string into a dtype object and then pass the dtype object around? E.g. In [1]: dt = np.dtype("i4,i4") In [2]: np.zeros(2, dtype=dt) Out[2]: array([(0, 0), (0, 0)], dtype=[('f0', ' Message-ID: <641995422468645328.080949sturla.molden-gmail.com@news.gmane.org> Johan wrote: > Hello, I searched the forum, but couldn't find a post related to my > problem. I am installing scipy via pip in cygwin environment I think I introduced this error when moving a global variable from the Cython module to a C++ module. The name collision with math.h was silent on Linux, Mac, and Windows (MinGW and MSVC) -- or not even present --, and thus went under the radar. But it eventually showed up on SunOS, and now also on Cygwin. :-( My apologies. Anyhow, it should be gone now. Try SciPy master. Sturla From charlesr.harris at gmail.com Sun Nov 8 20:46:17 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 8 Nov 2015 18:46:17 -0700 Subject: [Numpy-discussion] Feedback on new argument positions for ma.dot and MaskedArray.dot Message-ID: Hi All, I'd like some feedback for the position of the `strict` and `out` arguments for masked arrays. See gh-6653 for the PR in question. Current status without #6652 1. ma.dot(a, b, strict=False) -- established 2. a.dot(b, out=None) -- new in 1.10 Note that 1. requires adding `out` to the end for backward compatibility. OTOH, 2. is new(ish). 
We can either keep it compatible with ndarray.dot and add `strict` to the end and have it incompatible with 1., or, slightly changing it in 1.10.2, make it compatible with with 1. but incompatible with ndarray. We will face the same sort of problem with adding newer ndarray arguments other existing ma functions that have their own specialized arguments, so having a policy up front will be helpful. My own inclination here is to keep 1. and 2. compatible, and then perhaps at some point following a future warning, make both `strict` and `out` keyword arguments only. Another possiblitly is to make that transition immediate for the method. Thoughts? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From efiring at hawaii.edu Sun Nov 8 21:00:25 2015 From: efiring at hawaii.edu (Eric Firing) Date: Sun, 8 Nov 2015 16:00:25 -1000 Subject: [Numpy-discussion] Feedback on new argument positions for ma.dot and MaskedArray.dot In-Reply-To: References: Message-ID: <563FFE39.7060202@hawaii.edu> On 2015/11/08 3:46 PM, Charles R Harris wrote: > Hi All, > > I'd like some feedback for the position of the `strict` and `out` > arguments for masked arrays. See gh-6653 > for the PR in question. > > Current status without #6652 > > 1. ma.dot(a, b, strict=False) -- established > 2. a.dot(b, out=None) -- new in 1.10 > > > Note that 1. requires adding `out` to the end for backward > compatibility. OTOH, 2. is new(ish). We can either keep it compatible > with ndarray.dot and add `strict` to the end and have it incompatible > with 1., or, slightly changing it in 1.10.2, make it compatible with > with 1. but incompatible with ndarray. We will face the same sort of > problem with adding newer ndarray arguments other existing ma functions > that have their own specialized arguments, so having a policy up front > will be helpful. My own inclination here is to keep 1. and 2. > compatible, and then perhaps at some point following a future warning, > make both `strict` and `out` keyword arguments only. Another possiblitly > is to make that transition immediate for the method. I'm not sure about the best sequence, but I like the strategy of moving to keyword-only arguments. It is good for readability, and for flexibility. I also prefer that there be a single convention: either the "out" kwarg is the end of the every signature, or it is the first kwarg in every signature. It's a very special and unusual kwarg, so it should have a standard location. Eric > > Thoughts? > > Chuck > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > From njs at pobox.com Sun Nov 8 22:43:35 2015 From: njs at pobox.com (Nathaniel Smith) Date: Sun, 8 Nov 2015 19:43:35 -0800 Subject: [Numpy-discussion] Feedback on new argument positions for ma.dot and MaskedArray.dot In-Reply-To: <563FFE39.7060202@hawaii.edu> References: <563FFE39.7060202@hawaii.edu> Message-ID: On Nov 8, 2015 6:00 PM, "Eric Firing" wrote: > > I also prefer that there be a single convention: either the "out" kwarg is the end of the every signature, or it is the first kwarg in every signature. It's a very special and unusual kwarg, so it should have a standard location. For all ufuncs, out arguments come first immediately after in arguments, so +1 for doing that for consistency. -n -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bugreports2005 at cs.tut.fi Mon Nov 9 01:11:13 2015 From: bugreports2005 at cs.tut.fi (Lintula) Date: Mon, 9 Nov 2015 08:11:13 +0200 Subject: [Numpy-discussion] Failed numpy.test() with numpy-1.10.1 on RHEL 6 Message-ID: <56403901.50301@cs.tut.fi> Hello, I'm setting up numpy 1.10.1 on RHEL6 (python 2.6.6, atlas-3.8.4, lapack-3.2.1, gcc-4.4.7), and this test fails for me. I notice that someone else has had the same at https://github.com/numpy/numpy/issues/6063 in July. Is this harmless or is it of concern? ====================================================================== FAIL: test_umath.TestComplexFunctions.test_branch_cuts(, [-1, 0.5], [1j, 1j], 1, -1, True) ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/python2.6/site-packages/nose/case.py", line 182, in runTest self.test(*self.arg) File "/usr/lib64/python2.6/site-packages/numpy/core/tests/test_umath.py", line 1748, in _check_branch_cut assert_(np.all(np.absolute(y0.imag - yp.imag) < atol), (y0, yp)) File "/usr/lib64/python2.6/site-packages/numpy/testing/utils.py", line 53, in assert_ raise AssertionError(smsg) AssertionError: (array([ 0.00000000e+00+3.14159265j, 1.11022302e-16-1.04719755j]), array([ 4.71216091e-07+3.14159218j, 1.28119737e-13+1.04719755j])) ---------------------------------------------------------------------- Ran 5955 tests in 64.284s FAILED (KNOWNFAIL=3, SKIP=2, failures=1) From irvin.probst at ensta-bretagne.fr Mon Nov 9 04:15:04 2015 From: irvin.probst at ensta-bretagne.fr (Irvin Probst) Date: Mon, 9 Nov 2015 10:15:04 +0100 Subject: [Numpy-discussion] loadtxt and usecols Message-ID: <56406418.1010500@ensta-bretagne.fr> Hi, I've recently seen many students, coming from Matlab, struggling against the usecols argument of loadtxt. Most of them tried something like: loadtxt("foo.bar", usecols=2) or the ones with better documentation reading skills tried loadtxt("foo.bar", usecols=(2)) but none of them understood they had to write usecols=[2] or usecols=(2,). Is there a policy in numpy stating that this kind of arguments must be sequences ? I think that being able to an int or a sequence when a single column is needed would make this function a bit more user friendly for beginners. I would gladly submit a PR if noone disagrees. Regards. -- Irvin From ewm at redtetrahedron.org Mon Nov 9 08:24:51 2015 From: ewm at redtetrahedron.org (Eric Moore) Date: Mon, 9 Nov 2015 08:24:51 -0500 Subject: [Numpy-discussion] Failed numpy.test() with numpy-1.10.1 on RHEL 6 In-Reply-To: <56403901.50301@cs.tut.fi> References: <56403901.50301@cs.tut.fi> Message-ID: This fails because numpy uses the function `cacosh` from the libm and on RHEL6 this function has a bug. As long as you don't care about getting the sign right at the branch cut in this function, then it's harmless. If you do care, the easiest solution will be to install something like anaconda that does not link against the relatively old libm that RHEL6 ships. On Mon, Nov 9, 2015 at 1:11 AM, Lintula wrote: > Hello, > > I'm setting up numpy 1.10.1 on RHEL6 (python 2.6.6, atlas-3.8.4, > lapack-3.2.1, gcc-4.4.7), and this test fails for me. I notice that > someone else has had the same at > https://github.com/numpy/numpy/issues/6063 in July. > > Is this harmless or is it of concern? 
> > > ====================================================================== > FAIL: test_umath.TestComplexFunctions.test_branch_cuts( 'arccosh'>, [-1, 0.5], [1j, 1j], 1, -1, True) > ---------------------------------------------------------------------- > Traceback (most recent call last): > File "/usr/lib/python2.6/site-packages/nose/case.py", line 182, in > runTest > self.test(*self.arg) > File > "/usr/lib64/python2.6/site-packages/numpy/core/tests/test_umath.py", > line 1748, in _check_branch_cut > assert_(np.all(np.absolute(y0.imag - yp.imag) < atol), (y0, yp)) > File "/usr/lib64/python2.6/site-packages/numpy/testing/utils.py", line > 53, in assert_ > raise AssertionError(smsg) > AssertionError: (array([ 0.00000000e+00+3.14159265j, > 1.11022302e-16-1.04719755j]), array([ 4.71216091e-07+3.14159218j, > 1.28119737e-13+1.04719755j])) > > ---------------------------------------------------------------------- > Ran 5955 tests in 64.284s > > FAILED (KNOWNFAIL=3, SKIP=2, failures=1) > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From othalan at othalan.net Mon Nov 9 09:27:19 2015 From: othalan at othalan.net (David Morris) Date: Mon, 9 Nov 2015 07:27:19 -0700 Subject: [Numpy-discussion] Question about structure arrays In-Reply-To: <1446931102879-41653.post@n7.nabble.com> References: <1446931102879-41653.post@n7.nabble.com> Message-ID: On Nov 7, 2015 2:58 PM, "aerojockey" wrote: > > Hello, > > Recently I made some changes to a program I'm working on, and found that the > changes made it four times slower than before. After some digging, I found > out that one of the new costs was that I added structure arrays. Inside a > low-level loop, I create a structure array, populate it Python, then turn it > over to some handwritten C code for processing. It turned out that, when > passed a structure array as a dtype, numpy has to parse the dtype, which > included calls to re.match and eval. > > Now, this is not a big deal for me to work around by using ordinary slicing > and such, and also I can improve things by reusing arrays. Since this is > inner loop stuff, sacrificing readability for speed is an appropriate > tradeoff. > > Nevertheless, I was curious if there was a way (or any plans for there to be > a way) to compile a struture array dtype. I realize it's not the > bread-and-butter of numpy, but it turned out to be a very convenient feature > for my use case (populating an array of structures to pass off to C). I was just looking into structured arrays. In case it is relevant: Are you using certain 1.10? They are apparently a LOT slower than 1.9.3, an issue which will be fixed in a future version. David -------------- next part -------------- An HTML attachment was scrubbed... URL: From ben.v.root at gmail.com Mon Nov 9 13:42:49 2015 From: ben.v.root at gmail.com (Benjamin Root) Date: Mon, 9 Nov 2015 13:42:49 -0500 Subject: [Numpy-discussion] loadtxt and usecols In-Reply-To: <56406418.1010500@ensta-bretagne.fr> References: <56406418.1010500@ensta-bretagne.fr> Message-ID: My personal rule for flexible inputs like that is that it should be encouraged so long as it does not introduce ambiguity. Furthermore, Allowing a scalar as an input doesn't add a congitive disconnect on the user on how to specify multiple columns. Therefore, I'd give this a +1. 
On Mon, Nov 9, 2015 at 4:15 AM, Irvin Probst wrote: > Hi, > I've recently seen many students, coming from Matlab, struggling against > the usecols argument of loadtxt. Most of them tried something like: > loadtxt("foo.bar", usecols=2) or the ones with better documentation > reading skills tried loadtxt("foo.bar", usecols=(2)) but none of them > understood they had to write usecols=[2] or usecols=(2,). > > Is there a policy in numpy stating that this kind of arguments must be > sequences ? I think that being able to an int or a sequence when a single > column is needed would make this function a bit more user friendly for > beginners. I would gladly submit a PR if noone disagrees. > > Regards. > > -- > Irvin > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Mon Nov 9 14:36:57 2015 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Mon, 9 Nov 2015 20:36:57 +0100 Subject: [Numpy-discussion] loadtxt and usecols In-Reply-To: References: <56406418.1010500@ensta-bretagne.fr> Message-ID: On Mon, Nov 9, 2015 at 7:42 PM, Benjamin Root wrote: > My personal rule for flexible inputs like that is that it should be > encouraged so long as it does not introduce ambiguity. Furthermore, > Allowing a scalar as an input doesn't add a congitive disconnect on the > user on how to specify multiple columns. Therefore, I'd give this a +1. > > On Mon, Nov 9, 2015 at 4:15 AM, Irvin Probst < > irvin.probst at ensta-bretagne.fr> wrote: > >> Hi, >> I've recently seen many students, coming from Matlab, struggling against >> the usecols argument of loadtxt. Most of them tried something like: >> loadtxt("foo.bar", usecols=2) or the ones with better documentation >> reading skills tried loadtxt("foo.bar", usecols=(2)) but none of them >> understood they had to write usecols=[2] or usecols=(2,). >> >> Is there a policy in numpy stating that this kind of arguments must be >> sequences ? > > There isn't. In many/most cases it's array_like, which means scalar, sequence or array. > I think that being able to an int or a sequence when a single column is >> needed would make this function a bit more user friendly for beginners. I >> would gladly submit a PR if noone disagrees. >> > +1 Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Mon Nov 9 18:53:03 2015 From: chris.barker at noaa.gov (Chris Barker) Date: Mon, 9 Nov 2015 15:53:03 -0800 Subject: [Numpy-discussion] Question about structure arrays In-Reply-To: <1446931102879-41653.post@n7.nabble.com> References: <1446931102879-41653.post@n7.nabble.com> Message-ID: On Sat, Nov 7, 2015 at 1:18 PM, aerojockey wrote: > Inside a > low-level loop, I create a structure array, populate it Python, then turn > it > over to some handwritten C code for processing. can you do that inside bit of the low-level loop in C (or cython?) you often want to put the guts of your loop in C anyway... -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From charlesr.harris at gmail.com Mon Nov 9 19:43:37 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 9 Nov 2015 17:43:37 -0700 Subject: [Numpy-discussion] Feedback on new argument positions for ma.dot and MaskedArray.dot In-Reply-To: References: <563FFE39.7060202@hawaii.edu> Message-ID: On Sun, Nov 8, 2015 at 8:43 PM, Nathaniel Smith wrote: > On Nov 8, 2015 6:00 PM, "Eric Firing" wrote: > > > > I also prefer that there be a single convention: either the "out" kwarg > is the end of the every signature, or it is the first kwarg in every > signature. It's a very special and unusual kwarg, so it should have a > standard location. > > For all ufuncs, out arguments come first immediately after in arguments, > so +1 for doing that for consistency. > Agree that that is what to shoot for. The particular problem with `ma.dot` is that it already has the `strict` argument where the new `out` argument should go. I propose the following steps. 1. For backward compatibility, start by adding new arguments to the end 2. Later raise FutureWarning on positional arguments that are out of place 3. Then make all but early arguments keyword only Once we have keyword only for a while, it would be possible to add some arguments back as positional arguments, but it might be best to keep them as keyword only as suggested above. For the current PR, this means that the dot method will have positional arguments in a different order than ma.dot. Alternatively, out could be made keyword only in both, although that would require fixing up some tests. There is really no magical solution that avoids all difficulties that I can see. Unless a consensus develops otherwise, I will pursue step 1. and go for a 1.10.2rc tomorrow. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Mon Nov 9 19:54:01 2015 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 9 Nov 2015 16:54:01 -0800 Subject: [Numpy-discussion] Feedback on new argument positions for ma.dot and MaskedArray.dot In-Reply-To: References: <563FFE39.7060202@hawaii.edu> Message-ID: On Mon, Nov 9, 2015 at 4:43 PM, Charles R Harris wrote: > > > On Sun, Nov 8, 2015 at 8:43 PM, Nathaniel Smith wrote: >> >> On Nov 8, 2015 6:00 PM, "Eric Firing" wrote: >> > >> > I also prefer that there be a single convention: either the "out" kwarg >> > is the end of the every signature, or it is the first kwarg in every >> > signature. It's a very special and unusual kwarg, so it should have a >> > standard location. >> >> For all ufuncs, out arguments come first immediately after in arguments, >> so +1 for doing that for consistency. > > > Agree that that is what to shoot for. The particular problem with `ma.dot` > is that it already has the `strict` argument where the new `out` argument > should go. I propose the following steps. > > 1. For backward compatibility, start by adding new arguments to the end > 2. Later raise FutureWarning on positional arguments that are out of place > 3. Then make all but early arguments keyword only > > Once we have keyword only for a while, it would be possible to add some > arguments back as positional arguments, but it might be best to keep them as > keyword only as suggested above. > > For the current PR, this means that the dot method will have positional > arguments in a different order than ma.dot. Alternatively, out could be made > keyword only in both, although that would require fixing up some tests. 
> There is really no magical solution that avoids all difficulties that I can > see. > > Unless a consensus develops otherwise, I will pursue step 1. and go for a > 1.10.2rc tomorrow. If we're adding it in a funny place to ma.dot now (the end of the arglist) with the plan of changing it later, then why not make it kwarg-only in ma.dot now to start with? If this turns out to be annoying somehow then go ahead with whatever as far I'm concerned -- I don't want to hold up 1.10.2 by trying to micro-optimize the transition path for an obscure corner of np.ma :-). -n -- Nathaniel J. Smith -- http://vorpus.org From sebastian at sipsolutions.net Tue Nov 10 03:19:33 2015 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Tue, 10 Nov 2015 09:19:33 +0100 Subject: [Numpy-discussion] loadtxt and usecols In-Reply-To: References: <56406418.1010500@ensta-bretagne.fr> Message-ID: <1447143573.2487.9.camel@sipsolutions.net> On Mo, 2015-11-09 at 20:36 +0100, Ralf Gommers wrote: > > > On Mon, Nov 9, 2015 at 7:42 PM, Benjamin Root > wrote: > My personal rule for flexible inputs like that is that it > should be encouraged so long as it does not introduce > ambiguity. Furthermore, Allowing a scalar as an input doesn't > add a congitive disconnect on the user on how to specify > multiple columns. Therefore, I'd give this a +1. > > > On Mon, Nov 9, 2015 at 4:15 AM, Irvin Probst > wrote: > Hi, > I've recently seen many students, coming from Matlab, > struggling against the usecols argument of loadtxt. > Most of them tried something like: > loadtxt("foo.bar", usecols=2) or the ones with better > documentation reading skills tried loadtxt("foo.bar", > usecols=(2)) but none of them understood they had to > write usecols=[2] or usecols=(2,). > > Is there a policy in numpy stating that this kind of > arguments must be sequences ? > > > There isn't. In many/most cases it's array_like, which means scalar, > sequence or array. > Agree, I think we have, or should have, to types of things there (well, three since we certainly have "must be sequence"). Args such as "axes" which is typically just one, so we allow scalar, but can often be generalized to a sequence. And things that are array-likes (and broadcasting). So, if this is an array-like, however, the "correct" result could be different by broadcasting between `1` and `(1,)` analogous to indexing the full array with usecols: usecols=1 result: array([2, 3, 4, 5]) usecols=(1,) result [1]: array([[2, 3, 4, 5]]) since a scalar row (so just one row) is read and not a 2D array. I tend to say it should be an array-like argument and not a generalized sequence argument, just wanted to note that, since I am not sure what matlab does. - Sebastian [1] could go further and do `usecols=[[1]]` and get `array([[[2, 3, 4, 5]]])` > > I think that being able to an int or a sequence when a > single column is needed would make this function a bit > more user friendly for beginners. I would gladly > submit a PR if noone disagrees. > > +1 > > > Ralf > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From irvin.probst at ensta-bretagne.fr Tue Nov 10 04:24:57 2015 From: irvin.probst at ensta-bretagne.fr (Irvin Probst) Date: Tue, 10 Nov 2015 10:24:57 +0100 Subject: [Numpy-discussion] loadtxt and usecols In-Reply-To: <1447143573.2487.9.camel@sipsolutions.net> References: <56406418.1010500@ensta-bretagne.fr> <1447143573.2487.9.camel@sipsolutions.net> Message-ID: <5641B7E9.2090802@ensta-bretagne.fr> On 10/11/2015 09:19, Sebastian Berg wrote: > since a scalar row (so just one row) is read and not a 2D array. I tend > to say it should be an array-like argument and not a generalized > sequence argument, just wanted to note that, since I am not sure what > matlab does. Hi, By default Matlab reads everything, silently fails on what can't be converted into a float and the user has to guess what was read or not. Say you have a file like this: 2010-01-01 00:00:00 3.026 2010-01-01 01:00:00 4.049 2010-01-01 02:00:00 4.865 >> M=load('CONCARNEAU_2010.txt'); >> M(1:3,:) ans = 1.0e+03 * 2.0100 0 0.0030 2.0100 0.0010 0.0040 2.0100 0.0020 0.0049 I think this is a terrible way of doing it even if newcomers might find this handy. There are of course optionnal arguments (even regexps !) but to my knowledge almost no Matlab user even knows these arguments are there. Anyway, I made a PR here https://github.com/numpy/numpy/pull/6656 with usecols as an array-like. Regards. From sebastian at sipsolutions.net Tue Nov 10 08:17:32 2015 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Tue, 10 Nov 2015 14:17:32 +0100 Subject: [Numpy-discussion] loadtxt and usecols In-Reply-To: <5641B7E9.2090802@ensta-bretagne.fr> References: <56406418.1010500@ensta-bretagne.fr> <1447143573.2487.9.camel@sipsolutions.net> <5641B7E9.2090802@ensta-bretagne.fr> Message-ID: <1447161452.2487.15.camel@sipsolutions.net> On Di, 2015-11-10 at 10:24 +0100, Irvin Probst wrote: > On 10/11/2015 09:19, Sebastian Berg wrote: > > since a scalar row (so just one row) is read and not a 2D array. I tend > > to say it should be an array-like argument and not a generalized > > sequence argument, just wanted to note that, since I am not sure what > > matlab does. > > Hi, > By default Matlab reads everything, silently fails on what can't be > converted into a float and the user has to guess what was read or not. > Say you have a file like this: > > 2010-01-01 00:00:00 3.026 > 2010-01-01 01:00:00 4.049 > 2010-01-01 02:00:00 4.865 > > > >> M=load('CONCARNEAU_2010.txt'); > >> M(1:3,:) > > ans = > > 1.0e+03 * > > 2.0100 0 0.0030 > 2.0100 0.0010 0.0040 > 2.0100 0.0020 0.0049 > > > I think this is a terrible way of doing it even if newcomers might find > this handy. There are of course optionnal arguments (even regexps !) but > to my knowledge almost no Matlab user even knows these arguments are there. > > Anyway, I made a PR here https://github.com/numpy/numpy/pull/6656 with > usecols as an array-like. > Actually, it is the "sequence special case" type ;). (matlab does not have this, since matlab always returns 2-D I realized). As I said, if usecols is like indexing, the result should mimic: arr = np.loadtxt(f) arr = arr[usecols] in which case a 1-D array is returned if you put in a scalar into usecols (and you could even generalize usecols to higher dimensional array-likes). 
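For concreteness, a minimal sketch of the two readings on an already-parsed array (plain indexing only, not the actual loadtxt code):

import numpy as np

arr = np.arange(12).reshape(3, 4)   # stand-in for a parsed file: 3 rows, 4 columns

# "array-like"/indexing reading: the shape of the index decides the result shape
arr[:, 2].shape     # (3,)   -- a scalar column index drops that axis
arr[:, [2]].shape   # (3, 1) -- a length-1 sequence keeps the 2-D result

# "sequence of ints, scalar allowed for convenience" reading:
# usecols=2 would simply mean usecols=(2,), so both spellings give the same shape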
The way you implemented it -- which is fine, but I want to stress that there is a real decision being made here --, you always see it as a sequence but allow a scalar for convenience (i.e. always return a 2-D array). It is a `sequence of ints or int` type argument and not an array-like argument in my opinion. - Sebastian > Regards. > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From irvin.probst at ensta-bretagne.fr Tue Nov 10 10:07:13 2015 From: irvin.probst at ensta-bretagne.fr (Irvin Probst) Date: Tue, 10 Nov 2015 16:07:13 +0100 Subject: [Numpy-discussion] loadtxt and usecols In-Reply-To: <1447161452.2487.15.camel@sipsolutions.net> References: <56406418.1010500@ensta-bretagne.fr> <1447143573.2487.9.camel@sipsolutions.net> <5641B7E9.2090802@ensta-bretagne.fr> <1447161452.2487.15.camel@sipsolutions.net> Message-ID: <56420821.6010805@ensta-bretagne.fr> On 10/11/2015 14:17, Sebastian Berg wrote: > Actually, it is the "sequence special case" type ;). (matlab does not > have this, since matlab always returns 2-D I realized). > > As I said, if usecols is like indexing, the result should mimic: > > arr = np.loadtxt(f) > arr = arr[usecols] > > in which case a 1-D array is returned if you put in a scalar into > usecols (and you could even generalize usecols to higher dimensional > array-likes). > The way you implemented it -- which is fine, but I want to stress that > there is a real decision being made here --, you always see it as a > sequence but allow a scalar for convenience (i.e. always return a 2-D > array). It is a `sequence of ints or int` type argument and not an > array-like argument in my opinion. I think we have two separate problems here: The first one is whether loadtxt should always return a 2D array or should it match the shape of the usecol argument. From a CS guy point of view I do understand your concern here. Now from a teacher point of view I know many people expect to get a "matrix" (thank you Matlab...) and the "purity" of matching the dimension of the usecol variable will be seen by many people [1] as a nerdy useless heavyness noone cares of (no offense). So whatever you, seadoned numpy devs from this mailing list, decide I think it should be explained in the docstring with a very clear wording. My own opinion on this first problem is that loadtxt() should always return a 2D array, no less, no more. If I write np.loadtxt(f)[42] it means I want to read the whole file and then I explicitely ask for transforming the 2-D array loadtxt() returned into a 1-D array. Otoh if I write loadtxt(f, usecol=42) it means I don't want to read the other columns and I want only this one, but it does not mean that I want to change the returned array from 2-D to 1-D. I know this new behavior might break a lot of existing code as usecol=(42,) used to return a 1-D array, but usecol=((((42,)))) also returns a 1-D array so the current behavior is not consistent imho. The second problem is about the wording in the docstring, when I see "sequence of int or int" I uderstand I will have to cast into a 1-D python list whatever wicked N-dimensional object I use to store my column indexes, or hope list(my_object) will do it fine. 
On the other hand when I read "array-like" the function is telling me I don't have to worry about my object, as long as numpy knows how to cast it into an array it will be fine. Anyway I think something like that: import numpy as np a=[[[2,],[],[],],[],[],[]] foo=np.loadtxt("CONCARNEAU_2010.txt", usecols=a) should just work and return me a 2-D (or 1-D if you like) array with the data I asked for and I don't think "a" here is an int or a sequence of int (but it's a good example of why loadtxt() should not match the shape of the usecol argument). To make it short, let the reading function read the data in a consistent and predictible way and then let the user explicitely change the data's shape into anything he likes. Regards. [1] read non CS people trying to switch to numpy/scipy From ben.v.root at gmail.com Tue Nov 10 10:24:40 2015 From: ben.v.root at gmail.com (Benjamin Root) Date: Tue, 10 Nov 2015 10:24:40 -0500 Subject: [Numpy-discussion] loadtxt and usecols In-Reply-To: <56420821.6010805@ensta-bretagne.fr> References: <56406418.1010500@ensta-bretagne.fr> <1447143573.2487.9.camel@sipsolutions.net> <5641B7E9.2090802@ensta-bretagne.fr> <1447161452.2487.15.camel@sipsolutions.net> <56420821.6010805@ensta-bretagne.fr> Message-ID: Just pointing out np.loadtxt(..., ndmin=2) will always return a 2D array. Notice that without that option, the result is effectively squeezed. So if you don't specify that option, and you load up a CSV file with only one row, you will get a very differently shaped array than if you load up a CSV file with two rows. Ben Root On Tue, Nov 10, 2015 at 10:07 AM, Irvin Probst < irvin.probst at ensta-bretagne.fr> wrote: > On 10/11/2015 14:17, Sebastian Berg wrote: > >> Actually, it is the "sequence special case" type ;). (matlab does not >> have this, since matlab always returns 2-D I realized). >> >> As I said, if usecols is like indexing, the result should mimic: >> >> arr = np.loadtxt(f) >> arr = arr[usecols] >> >> in which case a 1-D array is returned if you put in a scalar into >> usecols (and you could even generalize usecols to higher dimensional >> array-likes). >> The way you implemented it -- which is fine, but I want to stress that >> there is a real decision being made here --, you always see it as a >> sequence but allow a scalar for convenience (i.e. always return a 2-D >> array). It is a `sequence of ints or int` type argument and not an >> array-like argument in my opinion. >> > > I think we have two separate problems here: > > The first one is whether loadtxt should always return a 2D array or should > it match the shape of the usecol argument. From a CS guy point of view I do > understand your concern here. Now from a teacher point of view I know many > people expect to get a "matrix" (thank you Matlab...) and the "purity" of > matching the dimension of the usecol variable will be seen by many people > [1] as a nerdy useless heavyness noone cares of (no offense). So whatever > you, seadoned numpy devs from this mailing list, decide I think it should > be explained in the docstring with a very clear wording. > > My own opinion on this first problem is that loadtxt() should always > return a 2D array, no less, no more. If I write np.loadtxt(f)[42] it means > I want to read the whole file and then I explicitely ask for transforming > the 2-D array loadtxt() returned into a 1-D array. 
Otoh if I write > loadtxt(f, usecol=42) it means I don't want to read the other columns and I > want only this one, but it does not mean that I want to change the returned > array from 2-D to 1-D. I know this new behavior might break a lot of > existing code as usecol=(42,) used to return a 1-D array, but > usecol=((((42,)))) also returns a 1-D array so the current behavior is not > consistent imho. > > The second problem is about the wording in the docstring, when I see > "sequence of int or int" I uderstand I will have to cast into a 1-D python > list whatever wicked N-dimensional object I use to store my column indexes, > or hope list(my_object) will do it fine. On the other hand when I read > "array-like" the function is telling me I don't have to worry about my > object, as long as numpy knows how to cast it into an array it will be fine. > > Anyway I think something like that: > > import numpy as np > a=[[[2,],[],[],],[],[],[]] > foo=np.loadtxt("CONCARNEAU_2010.txt", usecols=a) > > should just work and return me a 2-D (or 1-D if you like) array with the > data I asked for and I don't think "a" here is an int or a sequence of int > (but it's a good example of why loadtxt() should not match the shape of the > usecol argument). > > To make it short, let the reading function read the data in a consistent > and predictible way and then let the user explicitely change the data's > shape into anything he likes. > > Regards. > > [1] read non CS people trying to switch to numpy/scipy > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From davidmenhur at gmail.com Tue Nov 10 10:52:52 2015 From: davidmenhur at gmail.com (=?UTF-8?B?RGHPgGlk?=) Date: Tue, 10 Nov 2015 16:52:52 +0100 Subject: [Numpy-discussion] loadtxt and usecols In-Reply-To: <56420821.6010805@ensta-bretagne.fr> References: <56406418.1010500@ensta-bretagne.fr> <1447143573.2487.9.camel@sipsolutions.net> <5641B7E9.2090802@ensta-bretagne.fr> <1447161452.2487.15.camel@sipsolutions.net> <56420821.6010805@ensta-bretagne.fr> Message-ID: On 10 November 2015 at 16:07, Irvin Probst wrote: > I know this new behavior might break a lot of existing code as > usecol=(42,) used to return a 1-D array, but usecol=((((42,)))) also > returns a 1-D array so the current behavior is not consistent imho. ((((42,)))) is exactly the same as (42,) If you want a tuple of tuples, you have to do ((42,),), but then it raises: TypeError: list indices must be integers, not tuple. What numpy cares about is that whatever object you give it is iterable, and its entries are ints, so usecol={0:'a', 5:'b'} is perfectly valid. I think loadtxt should be a tool to read text files in the least surprising fashion, and a text file is a 1 or 2D container, so it shouldn't return any other shapes. Any fancy stuff one may want to do with the output should be done with the typical indexing tricks. If I want a single column, I would first be very surprised if I got a 2D array (I was bitten by this design in MATLAB many many times). For the rare cases where I do want a "fake" 2D array, I can make it explicit by expanding it with arr[:, np.newaxis], and then I know that the shape will be (N, 1) and not (1, N). Thus, usecols should be int or sequence of ints, and the result 1 or 2D. 
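A tiny sketch of that last point (shapes only, nothing loadtxt-specific):

import numpy as np

col = np.arange(5.0)        # a single requested column comes back 1-D: shape (5,)
col[:, np.newaxis].shape    # (5, 1) -- an explicit column vector when one is wanted, never (1, 5)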
In your example: a=[[[2,],[],[],],[],[],[]] foo=np.loadtxt("CONCARNEAU_2010.txt", usecols=a) What would the shape of foo be? /David. -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Tue Nov 10 10:57:26 2015 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Tue, 10 Nov 2015 16:57:26 +0100 Subject: [Numpy-discussion] loadtxt and usecols In-Reply-To: References: <56406418.1010500@ensta-bretagne.fr> <1447143573.2487.9.camel@sipsolutions.net> <5641B7E9.2090802@ensta-bretagne.fr> <1447161452.2487.15.camel@sipsolutions.net> <56420821.6010805@ensta-bretagne.fr> Message-ID: <1447171046.2487.22.camel@sipsolutions.net> On Di, 2015-11-10 at 10:24 -0500, Benjamin Root wrote: > Just pointing out np.loadtxt(..., ndmin=2) will always return a 2D > array. Notice that without that option, the result is effectively > squeezed. So if you don't specify that option, and you load up a CSV > file with only one row, you will get a very differently shaped array > than if you load up a CSV file with two rows. > Oh, well I personally think that default squeeze is an abomination :). Anyway, I just wanted to point out that it is two different possible logics, and we have to pick one. I have a slight preference for the indexing/array-like interpretation, but I am aware that from a usage point of view the sequence one is likely better. I could throw in another option: Throw an explicit error instead of the general. Anyway, I *really* do not have an opinion about what is better. Array-like would only suggest that you also accept buffer interface objects or array_interface stuff. Which in this case is really unnecessary I think. - Sebastian > > Ben Root > > > On Tue, Nov 10, 2015 at 10:07 AM, Irvin Probst > wrote: > On 10/11/2015 14:17, Sebastian Berg wrote: > Actually, it is the "sequence special case" type ;). > (matlab does not > have this, since matlab always returns 2-D I > realized). > > As I said, if usecols is like indexing, the result > should mimic: > > arr = np.loadtxt(f) > arr = arr[usecols] > > in which case a 1-D array is returned if you put in a > scalar into > usecols (and you could even generalize usecols to > higher dimensional > array-likes). > The way you implemented it -- which is fine, but I > want to stress that > there is a real decision being made here --, you > always see it as a > sequence but allow a scalar for convenience (i.e. > always return a 2-D > array). It is a `sequence of ints or int` type > argument and not an > array-like argument in my opinion. > > I think we have two separate problems here: > > The first one is whether loadtxt should always return a 2D > array or should it match the shape of the usecol argument. > From a CS guy point of view I do understand your concern here. > Now from a teacher point of view I know many people expect to > get a "matrix" (thank you Matlab...) and the "purity" of > matching the dimension of the usecol variable will be seen by > many people [1] as a nerdy useless heavyness noone cares of > (no offense). So whatever you, seadoned numpy devs from this > mailing list, decide I think it should be explained in the > docstring with a very clear wording. > > My own opinion on this first problem is that loadtxt() should > always return a 2D array, no less, no more. If I write > np.loadtxt(f)[42] it means I want to read the whole file and > then I explicitely ask for transforming the 2-D array > loadtxt() returned into a 1-D array. 
Otoh if I write > loadtxt(f, usecol=42) it means I don't want to read the other > columns and I want only this one, but it does not mean that I > want to change the returned array from 2-D to 1-D. I know this > new behavior might break a lot of existing code as > usecol=(42,) used to return a 1-D array, but > usecol=((((42,)))) also returns a 1-D array so the current > behavior is not consistent imho. > > The second problem is about the wording in the docstring, when > I see "sequence of int or int" I uderstand I will have to cast > into a 1-D python list whatever wicked N-dimensional object I > use to store my column indexes, or hope list(my_object) will > do it fine. On the other hand when I read "array-like" the > function is telling me I don't have to worry about my object, > as long as numpy knows how to cast it into an array it will be > fine. > > Anyway I think something like that: > > import numpy as np > a=[[[2,],[],[],],[],[],[]] > foo=np.loadtxt("CONCARNEAU_2010.txt", usecols=a) > > should just work and return me a 2-D (or 1-D if you like) > array with the data I asked for and I don't think "a" here is > an int or a sequence of int (but it's a good example of why > loadtxt() should not match the shape of the usecol argument). > > To make it short, let the reading function read the data in a > consistent and predictible way and then let the user > explicitely change the data's shape into anything he likes. > > Regards. > > [1] read non CS people trying to switch to numpy/scipy > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From irvin.probst at ensta-bretagne.fr Tue Nov 10 11:39:05 2015 From: irvin.probst at ensta-bretagne.fr (Irvin Probst) Date: Tue, 10 Nov 2015 17:39:05 +0100 Subject: [Numpy-discussion] loadtxt and usecols In-Reply-To: References: <56406418.1010500@ensta-bretagne.fr> <1447143573.2487.9.camel@sipsolutions.net> <5641B7E9.2090802@ensta-bretagne.fr> <1447161452.2487.15.camel@sipsolutions.net> <56420821.6010805@ensta-bretagne.fr> Message-ID: <56421DA9.8030308@ensta-bretagne.fr> On 10/11/2015 16:52, Da?id wrote: > ((((42,)))) is exactly the same as (42,) If you want a tuple of > tuples, you have to do ((42,),), but then it raises: TypeError: list > indices must be integers, not tuple. My bad, I wrote that too fast, please forget this. > I think loadtxt should be a tool to read text files in the least > surprising fashion, and a text file is a 1 or 2D container, so it > shouldn't return any other shapes. And I *do* agree with the "shouldn't return any other shapes" part of your phrase. What I was trying to say, admitedly with a very bogus example, is that either loadtxt() should always output an array whose shape matches the shape of the object passed to usecol or it should never do it, and I'm if favor of never. 
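For reference, a small sketch of the current behaviour under discussion (using a list of lines as a stand-in for a two-row, three-column text file):

import numpy as np

lines = ["1 2 3", "4 5 6"]

np.loadtxt(lines, usecols=(2,)).shape           # (2,)   -- a single column comes back squeezed to 1-D
np.loadtxt(lines, usecols=(2,), ndmin=2).shape  # (2, 1) -- kept 2-D, as Ben pointed out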
I'm perfectly aware that what I suggest would break the current behavior of usecols=(2,) so I know it does not have the slightest probability of being accepted but still, I think that the "least surprising fashion" is to always return an 2-D array because for many, many, many people a text data file has N lines and M columns and N=1 or M=1 is not a specific case. Anyway I will of course modify my PR according to any decision made here. In your example: > > a=[[[2,],[],[],],[],[],[]] > foo=np.loadtxt("CONCARNEAU_2010.txt", usecols=a) > > What would the shape of foo be? As I said in my previous email: > should just work and return me a 2-D (or 1-D if you like) array with the data I asked for So, 1-D or 2-D it is up to you, but as long as there is no ambiguity in which columns the user is asking for it should imho work. Regards. From pythondev1 at aerojockey.com Wed Nov 11 00:40:32 2015 From: pythondev1 at aerojockey.com (aerojockey) Date: Tue, 10 Nov 2015 22:40:32 -0700 (MST) Subject: [Numpy-discussion] Question about structure arrays In-Reply-To: References: <1446931102879-41653.post@n7.nabble.com> Message-ID: <1447220432593-41676.post@n7.nabble.com> Nathaniel Smith wrote > On Sat, Nov 7, 2015 at 1:18 PM, aerojockey < > pythondev1@ > > wrote: >> Hello, >> >> Recently I made some changes to a program I'm working on, and found that >> the >> changes made it four times slower than before. After some digging, I >> found >> out that one of the new costs was that I added structure arrays. Inside >> a >> low-level loop, I create a structure array, populate it Python, then turn >> it >> over to some handwritten C code for processing. It turned out that, when >> passed a structure array as a dtype, numpy has to parse the dtype, which >> included calls to re.match and eval. >> >> Now, this is not a big deal for me to work around by using ordinary >> slicing >> and such, and also I can improve things by reusing arrays. Since this is >> inner loop stuff, sacrificing readability for speed is an appropriate >> tradeoff. >> >> Nevertheless, I was curious if there was a way (or any plans for there to >> be >> a way) to compile a struture array dtype. I realize it's not the >> bread-and-butter of numpy, but it turned out to be a very convenient >> feature >> for my use case (populating an array of structures to pass off to C). > > Does it help to turn your dtype string into a dtype object and then > pass the dtype object around? E.g. > > In [1]: dt = np.dtype("i4,i4") > > In [2]: np.zeros(2, dtype=dt) > Out[2]: > array([(0, 0), (0, 0)], > dtype=[('f0', '<i4'), ('f1', '<i4')]) > > -n I actually don't know, since I removed the structure array part about ten minutes after I posted. However, I did a quick test of your suggestion, and indeed numpy calls exec and re.match only when creating the dtype object, not when creating the array. So certainly it would have helped. I wasn't actually aware you could do that with dtypes. In fact, I was only vaguely that there were dtype types at all. Thanks for the suggestion. Carl Banks -- View this message in context: http://numpy-discussion.10968.n7.nabble.com/Question-about-structure-arrays-tp41653p41676.html Sent from the Numpy-discussion mailing list archive at Nabble.com. 
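A minimal sketch of the pattern discussed above -- build the dtype object once and reuse it inside the hot loop (the field names and types here are made up for illustration):

import numpy as np

# Parsing a dtype string (e.g. "f8,f8,i4") is what calls re.match/eval; building
# the dtype object once -- from a string or, as here, a field list -- means that
# cost is not paid on every array allocation inside the loop.
point_dt = np.dtype([('x', np.float64), ('y', np.float64), ('id', np.int32)])

def make_batch(n):
    # reuse the pre-built dtype object for every array created in the loop
    rec = np.zeros(n, dtype=point_dt)
    rec['x'] = np.arange(n, dtype=np.float64)
    rec['y'] = rec['x'] ** 2
    rec['id'] = np.arange(n, dtype=np.int32)
    return rec

batch = make_batch(1024)   # e.g. hand this off to the C routine for processing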
From sebastian at sipsolutions.net Wed Nov 11 05:02:50 2015 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Wed, 11 Nov 2015 11:02:50 +0100 Subject: [Numpy-discussion] Indexing NEP draft Message-ID: <1447236170.2487.43.camel@sipsolutions.net> Hi all, at scipy discussing with Nathaniel and others, we thought that maybe we can push for orthogonal type indexing into numpy. Now with the new version out and some other discussions done, I thought it is time to pick it up :). The basic ideas are twofold. First make indexing easier and less confusing for starters (and advanced users also), and second improve interoperability with projects such as xray for whom orthogonal/outer type indexing makes more sense. I have started working on: 1. A preliminary draft of an NEP you can view at https://github.com/numpy/numpy/pull/6256/files?short_path=01e4dd9#diff-01e4dd9d2ecf994b24e5883f98f789e6 or at the end of this mail. 2. A preliminary implementation of `oindex` attribute with orthogonal/outer style indexing in https://github.com/numpy/numpy/pull/6075 which you can try out by cloning numpy and then running from the source dir: git fetch upstream pull/6075/head:pr-6075 && git checkout pr-6075; python runtests.py --ipython This will fetch my PR, switch to the branch and open an interactive ipython shell where you will be able to do arr.oindex[]. Note that I consider the NEP quite preliminary in many parts, and it may still be very confusing unless you are well versed with current advanced indexing. There are some longer examples comparing the different styles and another "example" which tries to show a "use case" example going from simpler to more complex indexing operations. Any comments are very welcome, and if it is "I don't understand a word" :). I know it is probably too short and, at least without examples, not easy to understand. Best, Sebastian ================================================================================== The current NEP draft: ========================================================== Implementing intuitive and full featured advanced indexing ========================================================== :Author: Sebastian Berg :Date: 2015-08-27 :Status: draft Executive summary ================= Advanced indexing with multiple array indices is typically confusing to both new, and in many cases even old, users of NumPy. To avoid this problem and allow for more and clearer features, we propose to: 1. Introduce ``arr.oindex[indices]`` which allows advanced indices, but uses outer indexing logic. 2. Introduce ``arr.vindex[indices]`` which use the current "vectorized"/broadcasted logic but with two differences from fancy indexing: 1. Boolean indices always use the outer indexing logic. (Multi dimensional booleans should be allowed). 2. The integer index result dimensions are always the first axes of the result array. No transpose is done, even for a single integer array index. 3. Vanilla indexing on the array will only give warnings and eventually errors either: * when there is ambiguity between legacy fancy and outer indexing (note that ``arr[[1, 2], :, 0]`` is such a case, an integer can be the "second" integer index array), * when any integer index array is present (possibly additional for more then one boolean index array). These constraints are sufficient for making indexing generally consistent with expectations and providing a less surprising learning curve with ``oindex``. Note that all things mentioned here apply both for assignment as well as subscription. 
Understanding these details is *not* easy. The `Examples` section gives code examples. And the hopefully easier `Motivational Example` provides some motivational use-cases for the general ideas and is likely a good start for anyone not intimately familiar with advanced indexing. Motivation ========== Old style advanced indexing with multiple array (boolean or integer) indices, also called "fancy indexing", tends to be very confusing for new users. While fancy (or legacy) indexing is useful in many cases one would naively assume that the result of multiple 1-d ranges is analogous to multiple slices along each dimension (also called "outer indexing"). However, legacy fancy indexing with multiple arrays broadcasts these arrays into a single index over multiple dimensions. There are three main points of confusion when multiple array indices are involved: 1. Most new users will usually expect outer indexing (consistent with slicing). This is also the most common way of handling this in other packages or languages. 2. The axes introduced by the array indices are at the front, unless all array indices are consecutive, in which case one can deduce where the user "expects" them to be: * `arr[:, [0, 1], :, [0, 1]]` will have the first dimension shaped 2. * `arr[:, [0, 1], [0, 1]]` will have the second dimension shaped 2. 3. When a boolean array index is mixed with another boolean or integer array, the result is very hard to understand (the boolean array is converted to integer array indices and then broadcast), and hardly useful. There is no well defined broadcast for booleans, so that boolean indices are logically always "``outer``" type indices. Proposed rules ============== From the three problems noted above some expectations for NumPy can be deduced: 1. There should be a prominent outer/orthogonal indexing method such as ``arr.oindex[indices]``. 2. Considering how confusing fancy indexing can be, it should only occur explicitly (e.g. ``arr.vindex[indices]``) 3. A new ``arr.vindex[indices]`` method, would not be tied to the confusing transpose rules of fancy indexing (which is for example needed for the simple case of a single advanced index). Thus, it no transposing should be done. The axes of the advanced indices are always inserted at the front, even for a single index. 4. Boolean indexing is conceptionally outer indexing. A broadcasting together with other advanced indices in the manner of legacy "fancy indexing" is generally not helpful or well defined. A user who wishes the "``nonzero``" plus broadcast behaviour can thus be expected to do this manually. Using this rule, a single boolean index can index into multiple dimensions at once. 5. An ``arr.lindex`` or ``arr.findex`` should likely be implemented to allow legacy fancy indexing indefinetly. This also gives a simple way to update fancy indexing code making deprecations to vanilla indexing easier. 6. Vanilla indexing ``arr[...]`` could return an error for ambiguous cases. For the beginning, this probably means cases where ``arr[ind]`` and ``arr.oindex[ind]`` return different results gives deprecation warnings. However, the exact rules for this (especially the final behaviour) are not quite clear in cases such as ``arr[0, :, index_arr]``. All other rules for indexing are identical. Open Questions ============== 1. Especially for the new indexing attributes ``oindex`` and ``vindex``, a case could be made to not implicitly add an ``Ellipsis`` index if necessary. This helps finding bugs since a too high dimensional array can be caught. 
(I am in favor for this, but doubt we should think about this for vanilla indexing.) 2. The names ``oindex`` and ``vindex`` are just suggestions at the time of writing this, another name NumPy has used for something like ``oindex`` is ``np.ix_``. See also below. 3. It would be possible to limit the use of boolean indices in ``vindex``, assuming that they are rare and to some degree special. (This would make implementation simpler, but I do not see a big reason.) 4. ``oindex`` and ``vindex`` could always return copies, even when no array operation occurs. One argument for using the same rules is that this way ``oindex`` can be used as a general index replacement. (There is likely no big reason for this, however, there is one reason: ``arr.vindex[array_scalar, ...]`` can occur, where ``arr_scalar`` should be a 0-D array. Copying always "fixes" the possible inconsistency.) 5. The final state to morph indexing in is not fixed in this PEP. It is for example possible that `arr[index]`` will be equivalent to ``arr.oindex`` at some point in the future. Since such a change will take years, it seems unnecessary to make specific decisions now. 6. Proposed changes to vanilla indexing could be postponed indefinetly or not taken in order to not break or force fixing of existing code bases. 7. Possible the ``vindex`` combination with boolean indexing could be rethought or not allowed at all for simplicity. Necessary changes to NumPy ========================== Implement ``arr.oindex`` and ``arr.vindex`` objects to allow these indexing operations and create warnings (and eventually deprecate) ambiguous direct indexing operations on arrays. Alternative Names ================= Possible names suggested (more suggestions will be added). ============== ======== ======= **Orthogonal** oindex oix **Vectorized** vindex fix **Legacy** l/findex ============== ======== ======= Examples ======== Since the various kinds of indexing is hard to grasp in many cases, these examples hopefully give some more insights. Note that they are all in terms of shape. All original dimensions start with 5, advanced indexing inserts less long dimensions. (Note that ``...`` or ``Ellipsis`` mostly inserts as many slices as needed to index the full array). These examples may be hard to grasp without working knowledge of advanced indexing as of NumPy 1.9. Example array:: >>> arr = np.ones((5, 6, 7, 8)) Legacy fancy indexing --------------------- Single index is transposed (this is the same for all indexing types):: >>> arr[[0], ...].shape (1, 6, 7, 8) >>> arr[:, [0], ...].shape (5, 1, 7, 8) Multiple indices are transposed *if* consecutive:: >>> arr[:, [0], [0], :].shape # future error (5, 1, 7) >>> arr[:, [0], :, [0]].shape # future error (1, 5, 6) It is important to note that a scalar *is* integer array index in this sense (and gets broadcasted with the other advanced index):: >>> arr[:, [0], 0, :].shape # future error (scalar is "fancy") (5, 1, 7) >>> arr[:, [0], :, 0].shape # future error (scalar is "fancy") (1, 5, 6) Single boolean index can act on multiple dimensions (especially the whole array). It has to match (as of 1.10. a deprecation warning) the dimensions. 
The boolean index is otherwise identical to (multiple consecutive) integer array indices:: >>> # Create boolean index with one True value for the last two dimensions: >>> bindx = np.zeros((7, 8), dtype=np.bool_) >>> bindx[[0, 0]] = True >>> arr[:, 0, bindx].shape (5, 1) >>> arr[0, :, bindx].shape (1, 6) The combination with anything that is not a scalar is confusing, e.g.:: >>> arr[[0], :, bindx].shape # bindx result broadcasts with [0] (1, 6) >>> arr[:, [0, 1], bindx] # IndexError Outer indexing -------------- Multiple indices are "orthogonal" and their result axes are inserted at the same place (they are not broadcasted):: >>> arr.oindex[:, [0], [0, 1], :].shape (5, 1, 2, 8) >>> arr.oindex[:, [0], :, [0, 1]].shape (5, 1, 7, 2) >>> arr.oindex[:, [0], 0, :].shape (5, 1, 8) >>> arr.oindex[:, [0], :, 0].shape (5, 1, 7) Boolean indices results are always inserted where the index is:: >>> # Create boolean index with one True value for the last two dimensions: >>> bindx = np.zeros((7, 8), dtype=np.bool_) >>> bindx[[0, 0]] = True >>> arr.oindex[:, 0, bindx].shape (5, 1) >>> arr.oindex[0, :, bindx].shape (6, 1) Nothing changed in the presence of other advanced indices since:: >>> arr.oindex[[0], :, bindx].shape (1, 6, 1) >>> arr.oindex[:, [0, 1], bindx] (5, 2, 1) Vectorized/inner indexing ------------------------- Multiple indices are broadcasted and iterated as one like fancy indexing, but the new axes area always inserted at the front:: >>> arr.vindex[:, [0], [0, 1], :].shape (2, 5, 8) >>> arr.vindex[:, [0], :, [0, 1]].shape (2, 5, 7) >>> arr.vindex[:, [0], 0, :].shape (1, 5, 8) >>> arr.vindex[:, [0], :, 0].shape (1, 5, 7) Boolean indices results are always inserted where the index is, exactly as in ``oindex`` given how specific they are to the axes they operate on:: >>> # Create boolean index with one True value for the last two dimensions: >>> bindx = np.zeros((7, 8), dtype=np.bool_) >>> bindx[[0, 0]] = True >>> arr.vindex[:, 0, bindx].shape (5, 1) >>> arr.vindex[0, :, bindx].shape (6, 1) But other advanced indices are again transposed to the front:: >>> arr.vindex[[0], :, bindx].shape (1, 6, 1) >>> arr.vindex[:, [0, 1], bindx] (2, 5, 1) Related Questions ================= There exist a further indexing or indexing like method. That is the inverse of a command such as ``np.argmin(arr, axis=axis)``, to pick the specific elements *along* an axis given an array of (at least typically) the same size. Doing such a thing with the indexing notation is not quite straight forward since the axis on which to pick elements has to be supplied. One plausible solution would be to create a function (calling it pick here for simplicity):: np.pick(arr, index_arr, axis=axis) where ``index_arr`` has to be the same shape as ``arr`` except along ``axis``. One could imagine that this can be useful together with other indexing types, but such a function may be sufficient and extra information needed seems easier to pass using a function convention. Another option would be to allow an argument such as ``compress_axes=None`` (just to have some name) which maps the axes from the index array to the new array with ``None`` signaling a new axis. Also keepdims could be added as a simple default. (Note that the use of axis is not compatible to ``np.take`` for an ``index_arr`` which is not zero or one dimensional.) Another solution is to provide functions or features to the ``arg*``functions to map this to the equivalent ``vindex`` indexing operation. 
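For orientation, a small sketch of what such a ``pick`` amounts to today with plain fancy indexing, in the simple 2-D, ``axis=1`` case (a sketch only, not a proposed API)::

    >>> arr = np.arange(12).reshape(4, 3)
    >>> index_arr = np.argmin(arr, axis=1)
    >>> picked = arr[np.arange(arr.shape[0]), index_arr]
    >>> np.array_equal(picked, arr.min(axis=1))
    True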
Motivational Example ==================== Imagine having a data acquisition software storing ``D`` channels and ``N`` datapoints along the time. She stores this into an ``(N, D)`` shaped array. During data analysis, we needs to fetch a pool of channels, for example to calculate a mean over them. This data can be faked using:: >>> arr = np.random.random((100, 10)) Now one may remember indexing with an integer array and find the correct code:: >>> group = arr[:, [2, 5]] >>> mean_value = arr.mean() However, assume that there were some specific time points (first dimension of the data) that need to be specially considered. These time points are already known and given by:: >>> interesting_times = np.array([1, 5, 8, 10], dtype=np.intp) Now to fetch them, we may try to modify the previous code:: >>> group_at_it = arr[interesting_times, [2, 5]] IndexError: Ambiguous index, use `.oindex` or `.vindex` An error such as this will point to read up the indexing documentation. This should make it clear, that ``oindex`` behaves more like slicing. So, out of the different methods it is the obvious choice (for now, this is a shape mismatch, but that could possibly also mention ``oindex``):: >>> group_at_it = arr.oindex[interesting_times, [2, 5]] Now of course one could also have used ``vindex``, but it is much less obvious how to achieve the right thing!:: >>> reshaped_times = interesting_times[:, np.newaxis] >>> group_at_it = arr.vindex[reshaped_times, [2, 5]] One may find, that for example our data is corrupt in some places. So, we need to replace these values by zero (or anything else) for these times. The first column may for example give the necessary information, so that changing the values becomes easy remembering boolean indexing:: >>> bad_data = arr[0] > 0.5 >>> arr[bad_data, :] = 0 Again, however, the columns may need to be handled more individually (but in groups), and the ``oindex`` attribute works well:: >>> arr.oindex[bad_data, [2, 5]] = 0 Note that it would be very hard to do this using legacy fancy indexing. The only way would be to create an integer array first:: >>> bad_data_indx = np.nonzero(bad_data)[0] >>> bad_data_indx_reshaped = bad_data_indx[:, np.newaxis] >>> arr[bad_data_indx_reshaped, [2, 5]] In any case we can use only ``oindex`` to do all of this without getting into any trouble or confused by the whole complexity of advanced indexing. But, some new features are added to the data acquisition. Different sensors have to be used depending on the times. Let us assume we already have created an array of indices:: >>> correct_sensors = np.random.randint(10, size=(100, 2)) Which lists for each time the two correct sensors in an ``(N, 2)`` array. A first try to achieve this may be ``arr[:, correct_sensors]`` and this does not work. It should be clear quickly that slicing cannot achieve the desired thing. But hopefully users will remember that there is ``vindex`` as a more powerful and flexible approach to advanced indexing. One may, if trying ``vindex`` randomly, be confused about:: >>> new_arr = arr.vindex[:, correct_sensors] which is neither the same, nor the correct result (see transposing rules)! This is because slicing works still the same in ``vindex``. 
However, reading the documentation and examples, one can hopefully quickly find the desired solution:: >>> rows = np.arange(len(arr)) >>> rows = rows[:, np.newaxis] # make shape fit with correct_sensors >>> new_arr = arr.vindex[rows, correct_sensors] At this point we have left the straight forward world of ``oindex`` but can do random picking of any element from the array. Note that in the last example a method such as mentioned in the ``Related Questions`` section could be more straight forward. But this approach is even more flexible, since ``rows`` does not have to be a simple ``arange``, but could be ``intersting_times``:: >>> correct_sensors_at_it = correct_sensors[interesting_times, :] >>> interesting_times_reshaped = interesting_times[:, np.newaxis] >>> new_arr_it = arr[interesting_times_reshaped, correct_sensors_at_it] Truly complex situation would arise now if you would for example pool ``L`` experiments into an array shaped ``(L, N, D)``. But for ``oindex`` this should not result into surprises. ``vindex``, being more powerful, will quite certainly create some confusion in this case but also cover pretty much all eventualities. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From sebastian at sipsolutions.net Wed Nov 11 12:38:50 2015 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Wed, 11 Nov 2015 18:38:50 +0100 Subject: [Numpy-discussion] loadtxt and usecols In-Reply-To: <56421DA9.8030308@ensta-bretagne.fr> References: <56406418.1010500@ensta-bretagne.fr> <1447143573.2487.9.camel@sipsolutions.net> <5641B7E9.2090802@ensta-bretagne.fr> <1447161452.2487.15.camel@sipsolutions.net> <56420821.6010805@ensta-bretagne.fr> <56421DA9.8030308@ensta-bretagne.fr> Message-ID: <1447263530.2487.54.camel@sipsolutions.net> On Di, 2015-11-10 at 17:39 +0100, Irvin Probst wrote: > On 10/11/2015 16:52, Da?id wrote: > > ((((42,)))) is exactly the same as (42,) If you want a tuple of > > tuples, you have to do ((42,),), but then it raises: TypeError: list > > indices must be integers, not tuple. > > My bad, I wrote that too fast, please forget this. > > > I think loadtxt should be a tool to read text files in the least > > surprising fashion, and a text file is a 1 or 2D container, so it > > shouldn't return any other shapes. > > And I *do* agree with the "shouldn't return any other shapes" part of > your phrase. What I was trying to say, admitedly with a very bogus > example, is that either loadtxt() should always output an array whose > shape matches the shape of the object passed to usecol or it should > never do it, and I'm if favor of never. Sounds fine to me, and considering the squeeze logic (which I think is unfortunate, but it is not something you can easily change), I would be for simply adding logic to accept a single integral argument and otherwise not change anything. I am personally against the flattening and even the array-like logic [1] currently in the PR, it seems like arbitrary generality for my taste without any obvious application. As said before, the other/additional thing that might be very helpful is trying to give a more useful error message. - Sebastian [1] Almost all 1-d array-likes will be sequences/iterables in any case, those that are not are so obscure that there is no point in explicitly supporting them. 
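A rough sketch of the kind of argument handling meant here (a hypothetical helper for illustration, not the code in the PR):

import operator

def _normalize_usecols(usecols):
    # accept a single integer for convenience, otherwise insist on a
    # sequence of integers and fail with a clear message
    if usecols is None:
        return None
    try:
        return [operator.index(usecols)]
    except TypeError:
        pass
    try:
        return [operator.index(col) for col in usecols]
    except TypeError:
        raise TypeError(
            "usecols must be an int or a sequence of ints, got %r" % (usecols,))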
> I'm perfectly aware that what I suggest would break the current behavior > of usecols=(2,) so I know it does not have the slightest probability of > being accepted but still, I think that the "least surprising fashion" is > to always return an 2-D array because for many, many, many people a text > data file has N lines and M columns and N=1 or M=1 is not a specific case. > > Anyway I will of course modify my PR according to any decision made here. > > In your example: > > > > a=[[[2,],[],[],],[],[],[]] > > foo=np.loadtxt("CONCARNEAU_2010.txt", usecols=a) > > > > What would the shape of foo be? > > As I said in my previous email: > > > should just work and return me a 2-D (or 1-D if you like) array with > the data I asked for > > So, 1-D or 2-D it is up to you, but as long as there is no ambiguity in > which columns the user is asking for it should imho work. > > Regards. > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From charlesr.harris at gmail.com Thu Nov 12 16:11:14 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 12 Nov 2015 14:11:14 -0700 Subject: [Numpy-discussion] Numpy 1.10.2rc1 Message-ID: Hi All, I am pleased to announce the release of Numpy 1.10.2rc1. This release should fix the problems exposed in 1.10.1, which is not to say there are no remaining problems. Please test this thoroughly, exspecially if you experienced problems with 1.10.1. Julian Taylor has opened an issue relating to cblas detection on Debian (and probably Debian derived distributions) that is not dealt with in this release. Hopefully a solution will be available before the final. To all who reported issues with 1.10.1 and to those who helped close them, a big thank you. Source and binary files may be found on Sourceforge . Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From irvin.probst at ensta-bretagne.fr Fri Nov 13 05:51:54 2015 From: irvin.probst at ensta-bretagne.fr (Irvin Probst) Date: Fri, 13 Nov 2015 11:51:54 +0100 Subject: [Numpy-discussion] loadtxt and usecols In-Reply-To: <1447263530.2487.54.camel@sipsolutions.net> References: <56406418.1010500@ensta-bretagne.fr> <1447143573.2487.9.camel@sipsolutions.net> <5641B7E9.2090802@ensta-bretagne.fr> <1447161452.2487.15.camel@sipsolutions.net> <56420821.6010805@ensta-bretagne.fr> <56421DA9.8030308@ensta-bretagne.fr> <1447263530.2487.54.camel@sipsolutions.net> Message-ID: <5645C0CA.1060506@ensta-bretagne.fr> On 11/11/2015 18:38, Sebastian Berg wrote: > > Sounds fine to me, and considering the squeeze logic (which I think is > unfortunate, but it is not something you can easily change), I would be > for simply adding logic to accept a single integral argument and > otherwise not change anything. > [...] > > As said before, the other/additional thing that might be very helpful is > trying to give a more useful error message. > I've modified my PR to (hopefully) match these requests. https://github.com/numpy/numpy/pull/6656 Regards. 
-- Irvin From charlesr.harris at gmail.com Fri Nov 13 13:06:37 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 13 Nov 2015 11:06:37 -0700 Subject: [Numpy-discussion] Removiing 1.10.0 and 1.10.1 from sourceforge and pypi Message-ID: Hi All, I think 1.10.0 and 1.10.1 are sufficiently buggy that they should be removed from circulation as soon as 1.10.2 comes out. The inner product bug for non contiguous arrays is particularly egregious. It is not customary to remove outdated Numpy releases from sourceforge and pypi, but I'd like to make an exception for those two. Thoughts? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Fri Nov 13 13:21:51 2015 From: matthew.brett at gmail.com (Matthew Brett) Date: Fri, 13 Nov 2015 10:21:51 -0800 Subject: [Numpy-discussion] Removiing 1.10.0 and 1.10.1 from sourceforge and pypi In-Reply-To: References: Message-ID: Hi, On Fri, Nov 13, 2015 at 10:06 AM, Charles R Harris wrote: > Hi All, > > I think 1.10.0 and 1.10.1 are sufficiently buggy that they should be removed > from circulation as soon as 1.10.2 comes out. The inner product bug for non > contiguous arrays is particularly egregious. It is not customary to remove > outdated Numpy releases from sourceforge and pypi, but I'd like to make an > exception for those two. First pass, that doesn't seem like a good idea to me. No-one will get these releases unless they ask for them specifically, once 1.10.2 is out. Imagine for example that someone does have one of these versions already and is making a bug report, we might want to test against those versions to replicate the bug. Cheers, Matthew From manolo at austrohungaro.com Fri Nov 13 13:22:32 2015 From: manolo at austrohungaro.com (Manolo =?iso-8859-1?Q?Mart=EDnez?=) Date: Fri, 13 Nov 2015 19:22:32 +0100 Subject: [Numpy-discussion] Removiing 1.10.0 and 1.10.1 from sourceforge and pypi In-Reply-To: References: Message-ID: <20151113182232.GA6748@beagle.localdomain> On 11/13/15 at 11:06am, Charles R Harris wrote: > The inner product bug > for non contiguous arrays is particularly egregious. Could you please post a link to the related issue? I have been seeing very strange things with scipy.integrate.odeint, and I wonder if they are related. Thanks, Manolo From charlesr.harris at gmail.com Fri Nov 13 14:16:09 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 13 Nov 2015 12:16:09 -0700 Subject: [Numpy-discussion] Removiing 1.10.0 and 1.10.1 from sourceforge and pypi In-Reply-To: <20151113182232.GA6748@beagle.localdomain> References: <20151113182232.GA6748@beagle.localdomain> Message-ID: On Fri, Nov 13, 2015 at 11:22 AM, Manolo Mart?nez wrote: > On 11/13/15 at 11:06am, Charles R Harris wrote: > > The inner product bug > > for non contiguous arrays is particularly egregious. > > Could you please post a link to the related issue? I have been seeing > very strange things with scipy.integrate.odeint, and I wonder if they > are related. > Here you go: https://github.com/numpy/numpy/issues/6532 . Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From manolo at austrohungaro.com Fri Nov 13 14:26:01 2015 From: manolo at austrohungaro.com (Manolo =?iso-8859-1?Q?Mart=EDnez?=) Date: Fri, 13 Nov 2015 20:26:01 +0100 Subject: [Numpy-discussion] Removiing 1.10.0 and 1.10.1 from sourceforge and pypi In-Reply-To: References: <20151113182232.GA6748@beagle.localdomain> Message-ID: <20151113192601.GA8191@beagle.localdomain> On 11/13/15 at 12:16pm, Charles R Harris wrote: > On Fri, Nov 13, 2015 at 11:22 AM, Manolo Mart?nez > wrote: > > > On 11/13/15 at 11:06am, Charles R Harris wrote: > > > The inner product bug > > > for non contiguous arrays is particularly egregious. > > > > Could you please post a link to the related issue? I have been seeing > > very strange things with scipy.integrate.odeint, and I wonder if they > > are related. > > > > Here you go: https://github.com/numpy/numpy/issues/6532 > . > Thanks! M > Chuck > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion -- From njs at pobox.com Fri Nov 13 14:50:23 2015 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 13 Nov 2015 11:50:23 -0800 Subject: [Numpy-discussion] Removiing 1.10.0 and 1.10.1 from sourceforge and pypi In-Reply-To: References: Message-ID: On Nov 13, 2015 10:06 AM, "Charles R Harris" wrote: > > Hi All, > > I think 1.10.0 and 1.10.1 are sufficiently buggy that they should be removed from circulation as soon as 1.10.2 comes out. The inner product bug for non contiguous arrays is particularly egregious. It is not customary to remove outdated Numpy releases from sourceforge and pypi, but I'd like to make an exception for those two. > Can you elaborate on what you're trying to accomplish? Like Matthew says, they'll effectively be removed from circulation once 1.10.2 is released, regardless of whether we actually delete the files. But deleting the files does make it difficult to do legitimate things. (Example: rerunning an analysis with both 1.10.1 and 1.10.2 to check whether some published results were affected by one of the bugs in 1.10.1.) -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Fri Nov 13 15:04:36 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 13 Nov 2015 13:04:36 -0700 Subject: [Numpy-discussion] Removiing 1.10.0 and 1.10.1 from sourceforge and pypi In-Reply-To: References: Message-ID: On Fri, Nov 13, 2015 at 12:50 PM, Nathaniel Smith wrote: > On Nov 13, 2015 10:06 AM, "Charles R Harris" > wrote: > > > > Hi All, > > > > I think 1.10.0 and 1.10.1 are sufficiently buggy that they should be > removed from circulation as soon as 1.10.2 comes out. The inner product bug > for non contiguous arrays is particularly egregious. It is not customary to > remove outdated Numpy releases from sourceforge and pypi, but I'd like to > make an exception for those two. > > > > Can you elaborate on what you're trying to accomplish? Like Matthew says, > they'll effectively be removed from circulation once 1.10.2 is released, > regardless of whether we actually delete the files. But deleting the files > does make it difficult to do legitimate things. (Example: rerunning an > analysis with both 1.10.1 and 1.10.2 to check whether some published > results were affected by one of the bugs in 1.10.1.) > Basically, they should never be used, but they will always be tagged in the repo. That said, if the consensus is to leave them up I won't be bothered much. 
Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Sat Nov 14 04:21:36 2015 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sat, 14 Nov 2015 10:21:36 +0100 Subject: [Numpy-discussion] Removiing 1.10.0 and 1.10.1 from sourceforge and pypi In-Reply-To: References: Message-ID: On Fri, Nov 13, 2015 at 9:04 PM, Charles R Harris wrote: > > > On Fri, Nov 13, 2015 at 12:50 PM, Nathaniel Smith wrote: > >> On Nov 13, 2015 10:06 AM, "Charles R Harris" >> wrote: >> > >> > Hi All, >> > >> > I think 1.10.0 and 1.10.1 are sufficiently buggy that they should be >> removed from circulation as soon as 1.10.2 comes out. The inner product bug >> for non contiguous arrays is particularly egregious. It is not customary to >> remove outdated Numpy releases from sourceforge and pypi, but I'd like to >> make an exception for those two. >> > >> >> Can you elaborate on what you're trying to accomplish? Like Matthew says, >> they'll effectively be removed from circulation once 1.10.2 is released, >> regardless of whether we actually delete the files. But deleting the files >> does make it difficult to do legitimate things. (Example: rerunning an >> analysis with both 1.10.1 and 1.10.2 to check whether some published >> results were affected by one of the bugs in 1.10.1.) >> > > Basically, they should never be used, but they will always be tagged in > the repo. That said, if the consensus is to leave them up I won't be > bothered much. > I'd also vote for leaving them up. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From tcaswell at gmail.com Sat Nov 14 18:34:38 2015 From: tcaswell at gmail.com (Thomas Caswell) Date: Sat, 14 Nov 2015 23:34:38 +0000 Subject: [Numpy-discussion] Removiing 1.10.0 and 1.10.1 from sourceforge and pypi In-Reply-To: References: Message-ID: I would also vote for leaving them up. On Sat, Nov 14, 2015 at 4:21 AM Ralf Gommers wrote: > On Fri, Nov 13, 2015 at 9:04 PM, Charles R Harris < > charlesr.harris at gmail.com> wrote: > >> >> >> On Fri, Nov 13, 2015 at 12:50 PM, Nathaniel Smith wrote: >> >>> On Nov 13, 2015 10:06 AM, "Charles R Harris" >>> wrote: >>> > >>> > Hi All, >>> > >>> > I think 1.10.0 and 1.10.1 are sufficiently buggy that they should be >>> removed from circulation as soon as 1.10.2 comes out. The inner product bug >>> for non contiguous arrays is particularly egregious. It is not customary to >>> remove outdated Numpy releases from sourceforge and pypi, but I'd like to >>> make an exception for those two. >>> > >>> >>> Can you elaborate on what you're trying to accomplish? Like Matthew >>> says, they'll effectively be removed from circulation once 1.10.2 is >>> released, regardless of whether we actually delete the files. But deleting >>> the files does make it difficult to do legitimate things. (Example: >>> rerunning an analysis with both 1.10.1 and 1.10.2 to check whether some >>> published results were affected by one of the bugs in 1.10.1.) >>> >> >> Basically, they should never be used, but they will always be tagged in >> the repo. That said, if the consensus is to leave them up I won't be >> bothered much. >> > > I'd also vote for leaving them up. > > Ralf > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From faltet at gmail.com Mon Nov 16 08:02:01 2015 From: faltet at gmail.com (Francesc Alted) Date: Mon, 16 Nov 2015 14:02:01 +0100 Subject: [Numpy-discussion] ANN: bcolz 0.12.0 released Message-ID: ======================= Announcing bcolz 0.12.0 ======================= What's new ========== This release copes with some compatibility issues with NumPy 1.10. Also, several improvements have happened in the installation procedure, allowing for a smoother process. Last but not least, the tutorials haven been migrated to the IPython notebook format (a huge thank you to Francesc Elies for this!). This will hopefully will allow users to better exercise the different features of bcolz. For a more detailed change log, see: https://github.com/Blosc/bcolz/blob/master/RELEASE_NOTES.rst What it is ========== *bcolz* provides columnar and compressed data containers that can live either on-disk or in-memory. Column storage allows for efficiently querying tables with a large number of columns. It also allows for cheap addition and removal of column. In addition, bcolz objects are compressed by default for reducing memory/disk I/O needs. The compression process is carried out internally by Blosc, an extremely fast meta-compressor that is optimized for binary data. Lastly, high-performance iterators (like ``iter()``, ``where()``) for querying the objects are provided. bcolz can use numexpr internally so as to accelerate many vector and query operations (although it can use pure NumPy for doing so too). numexpr optimizes the memory usage and use several cores for doing the computations, so it is blazing fast. Moreover, since the carray/ctable containers can be disk-based, and it is possible to use them for seamlessly performing out-of-memory computations. bcolz has minimal dependencies (NumPy), comes with an exhaustive test suite and fully supports both 32-bit and 64-bit platforms. Also, it is typically tested on both UNIX and Windows operating systems. Together, bcolz and the Blosc compressor, are finally fulfilling the promise of accelerating memory I/O, at least for some real scenarios: http://nbviewer.ipython.org/github/Blosc/movielens-bench/blob/master/querying-ep14.ipynb#Plots Other users of bcolz are Visualfabriq (http://www.visualfabriq.com/) the Blaze project (http://blaze.pydata.org/), Quantopian (https://www.quantopian.com/) and Scikit-Allel (https://github.com/cggh/scikit-allel) which you can read more about by pointing your browser at the links below. 
* Visualfabriq: * *bquery*, A query and aggregation framework for Bcolz: * https://github.com/visualfabriq/bquery * Blaze: * Notebooks showing Blaze + Pandas + BColz interaction: * http://nbviewer.ipython.org/url/blaze.pydata.org/notebooks/timings-csv.ipynb * http://nbviewer.ipython.org/url/blaze.pydata.org/notebooks/timings-bcolz.ipynb * Quantopian: * Using compressed data containers for faster backtesting at scale: * https://quantopian.github.io/talks/NeedForSpeed/slides.html * Scikit-Allel * Provides an alternative backend to work with compressed arrays * https://scikit-allel.readthedocs.org/en/latest/model/bcolz.html Installing ========== bcolz is in the PyPI repository, so installing it is easy:: $ pip install -U bcolz Resources ========= Visit the main bcolz site repository at: http://github.com/Blosc/bcolz Manual: http://bcolz.blosc.org Home of Blosc compressor: http://blosc.org User's mail list: bcolz at googlegroups.com http://groups.google.com/group/bcolz License is the new BSD: https://github.com/Blosc/bcolz/blob/master/LICENSES/BCOLZ.txt Release notes can be found in the Git repository: https://github.com/Blosc/bcolz/blob/master/RELEASE_NOTES.rst ---- **Enjoy data!** -- Francesc Alted -------------- next part -------------- An HTML attachment was scrubbed... URL: From ndbecker2 at gmail.com Tue Nov 17 10:48:44 2015 From: ndbecker2 at gmail.com (Neal Becker) Date: Tue, 17 Nov 2015 10:48:44 -0500 Subject: [Numpy-discussion] reshaping array question Message-ID: I have an array of shape (7, 24, 2, 1024) I'd like an array of (7, 24, 2048) such that the elements on the last dimension are interleaving the elements from the 3rd dimension [0,0,0,0] -> [0,0,0] [0,0,1,0] -> [0,0,1] [0,0,0,1] -> [0,0,2] [0,0,1,1] -> [0,0,3] ... What might be the simplest way to do this? ------------ A different question, suppose I just want to stack them [0,0,0,0] -> [0,0,0] [0,0,0,1] -> [0,0,1] [0,0,0,2] -> [0,0,2] ... [0,0,1,0] -> [0,0,1024] [0,0,1,1] -> [0,0,1025] [0,0,1,2] -> [0,0,1026] ... From robert.kern at gmail.com Tue Nov 17 11:20:25 2015 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 17 Nov 2015 16:20:25 +0000 Subject: [Numpy-discussion] reshaping array question In-Reply-To: References: Message-ID: On Tue, Nov 17, 2015 at 3:48 PM, Neal Becker wrote: > > I have an array of shape > (7, 24, 2, 1024) > > I'd like an array of > (7, 24, 2048) > > such that the elements on the last dimension are interleaving the elements > from the 3rd dimension > > [0,0,0,0] -> [0,0,0] > [0,0,1,0] -> [0,0,1] > [0,0,0,1] -> [0,0,2] > [0,0,1,1] -> [0,0,3] > ... > > What might be the simplest way to do this? np.transpose(A, (-2, -1)).reshape(A.shape[:-2] + (-1,)) > ------------ > A different question, suppose I just want to stack them > > [0,0,0,0] -> [0,0,0] > [0,0,0,1] -> [0,0,1] > [0,0,0,2] -> [0,0,2] > ... > [0,0,1,0] -> [0,0,1024] > [0,0,1,1] -> [0,0,1025] > [0,0,1,2] -> [0,0,1026] > ... A.reshape(A.shape[:-2] + (-1,)) -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... 
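For reference, here is a minimal sketch (not from the thread) of both reshapes asked about above, run on a small made-up array so the result can be checked by eye. It uses np.swapaxes followed by a reshape for the interleaved case; np.transpose would need a full axes tuple such as (0, 1, 3, 2), which is what the follow-ups below end up clarifying.

```python
import numpy as np

# Small stand-in for the (7, 24, 2, 1024) array from the question.
A = np.arange(2 * 3 * 2 * 4).reshape(2, 3, 2, 4)

# Interleave the last two axes: swap them, then collapse into one axis.
interleaved = np.swapaxes(A, -2, -1).reshape(A.shape[:-2] + (-1,))

# Plain stacking needs no swap; a reshape alone lays the trailing blocks
# end to end.
stacked = A.reshape(A.shape[:-2] + (-1,))

print(interleaved.shape, stacked.shape)  # (2, 3, 8) (2, 3, 8)
print(interleaved[0, 0])                 # [0 4 1 5 2 6 3 7]
print(stacked[0, 0])                     # [0 1 2 3 4 5 6 7]
```

Note that the swapped array is no longer contiguous, so the reshape in the interleaved case makes a copy, while the stacked case is a plain view.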
URL: From sebastian at sipsolutions.net Tue Nov 17 11:23:45 2015 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Tue, 17 Nov 2015 17:23:45 +0100 Subject: [Numpy-discussion] reshaping array question In-Reply-To: References: Message-ID: <1447777425.2734.1.camel@sipsolutions.net> On Di, 2015-11-17 at 10:48 -0500, Neal Becker wrote: > I have an array of shape > (7, 24, 2, 1024) > > I'd like an array of > (7, 24, 2048) > > such that the elements on the last dimension are interleaving the elements > from the 3rd dimension > Which basically means you want to reshape with the earlier index varying faster. This is fortran order (in the simplest case for two axes being reshaped). So you can do: arr.reshape((7, 24, -1), order="F") otherwise, if order seems too confusing or dangerous. Just transpose the two axes first: arr_r = arr.transpose((0, 1, 3, 2)) arr_r = arr_r.reshape((7, 24, -1)) - Sebastian > [0,0,0,0] -> [0,0,0] > [0,0,1,0] -> [0,0,1] > [0,0,0,1] -> [0,0,2] > [0,0,1,1] -> [0,0,3] > ... > > What might be the simplest way to do this? > > ------------ > A different question, suppose I just want to stack them > > [0,0,0,0] -> [0,0,0] > [0,0,0,1] -> [0,0,1] > [0,0,0,2] -> [0,0,2] > ... > [0,0,1,0] -> [0,0,1024] > [0,0,1,1] -> [0,0,1025] > [0,0,1,2] -> [0,0,1026] > ... > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From ndbecker2 at gmail.com Tue Nov 17 13:49:21 2015 From: ndbecker2 at gmail.com (Neal Becker) Date: Tue, 17 Nov 2015 13:49:21 -0500 Subject: [Numpy-discussion] reshaping array question References: Message-ID: Robert Kern wrote: > On Tue, Nov 17, 2015 at 3:48 PM, Neal Becker wrote: >> >> I have an array of shape >> (7, 24, 2, 1024) >> >> I'd like an array of >> (7, 24, 2048) >> >> such that the elements on the last dimension are interleaving the >> elements from the 3rd dimension >> >> [0,0,0,0] -> [0,0,0] >> [0,0,1,0] -> [0,0,1] >> [0,0,0,1] -> [0,0,2] >> [0,0,1,1] -> [0,0,3] >> ... >> >> What might be the simplest way to do this? > > np.transpose(A, (-2, -1)).reshape(A.shape[:-2] + (-1,)) I get an error on that 1st transpose: here, 'A' is 'fftouts' print (fftouts.shape) print (np.transpose (fftouts, (-2,-1)).shape) (4, 24, 2, 1024) <<< fftouts.shape prints this Traceback (most recent call last): File "test_uw2.py", line 194, in run_line (sys.argv) File "test_uw2.py", line 190, in run_line run (opt) File "test_uw2.py", line 103, in run print (np.transpose (fftouts, (-2,-1)).shape) File "/home/nbecker/.local/lib/python2.7/site- packages/numpy/core/fromnumeric.py", line 551, in transpose return transpose(axes) ValueError: axes don't match array > >> ------------ >> A different question, suppose I just want to stack them >> >> [0,0,0,0] -> [0,0,0] >> [0,0,0,1] -> [0,0,1] >> [0,0,0,2] -> [0,0,2] >> ... >> [0,0,1,0] -> [0,0,1024] >> [0,0,1,1] -> [0,0,1025] >> [0,0,1,2] -> [0,0,1026] >> ... 
> > A.reshape(A.shape[:-2] + (-1,)) > > -- > Robert Kern From sebastian at sipsolutions.net Tue Nov 17 13:53:34 2015 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Tue, 17 Nov 2015 19:53:34 +0100 Subject: [Numpy-discussion] reshaping array question In-Reply-To: References: Message-ID: <1447786414.2734.3.camel@sipsolutions.net> On Di, 2015-11-17 at 13:49 -0500, Neal Becker wrote: > Robert Kern wrote: > > > On Tue, Nov 17, 2015 at 3:48 PM, Neal Becker wrote: > >> > >> I have an array of shape > >> (7, 24, 2, 1024) > >> > >> I'd like an array of > >> (7, 24, 2048) > >> > >> such that the elements on the last dimension are interleaving the > >> elements from the 3rd dimension > >> > >> [0,0,0,0] -> [0,0,0] > >> [0,0,1,0] -> [0,0,1] > >> [0,0,0,1] -> [0,0,2] > >> [0,0,1,1] -> [0,0,3] > >> ... > >> > >> What might be the simplest way to do this? > > > > np.transpose(A, (-2, -1)).reshape(A.shape[:-2] + (-1,)) > > I get an error on that 1st transpose: > Transpose needs a slightly different input. If you look at the help, it should be clear. The help might also point to np.swapaxes, which may be a bit more straight forward for this exact case. > here, 'A' is 'fftouts' > > print (fftouts.shape) > print (np.transpose (fftouts, (-2,-1)).shape) > > (4, 24, 2, 1024) <<< fftouts.shape prints this > Traceback (most recent call last): > File "test_uw2.py", line 194, in > run_line (sys.argv) > File "test_uw2.py", line 190, in run_line > run (opt) > File "test_uw2.py", line 103, in run > print (np.transpose (fftouts, (-2,-1)).shape) > File "/home/nbecker/.local/lib/python2.7/site- > packages/numpy/core/fromnumeric.py", line 551, in transpose > return transpose(axes) > ValueError: axes don't match array > > > > >> ------------ > >> A different question, suppose I just want to stack them > >> > >> [0,0,0,0] -> [0,0,0] > >> [0,0,0,1] -> [0,0,1] > >> [0,0,0,2] -> [0,0,2] > >> ... > >> [0,0,1,0] -> [0,0,1024] > >> [0,0,1,1] -> [0,0,1025] > >> [0,0,1,2] -> [0,0,1026] > >> ... > > > > A.reshape(A.shape[:-2] + (-1,)) > > > > -- > > Robert Kern > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From robert.kern at gmail.com Tue Nov 17 14:05:09 2015 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 17 Nov 2015 19:05:09 +0000 Subject: [Numpy-discussion] reshaping array question In-Reply-To: <1447786414.2734.3.camel@sipsolutions.net> References: <1447786414.2734.3.camel@sipsolutions.net> Message-ID: On Nov 17, 2015 6:53 PM, "Sebastian Berg" wrote: > > On Di, 2015-11-17 at 13:49 -0500, Neal Becker wrote: > > Robert Kern wrote: > > > > > On Tue, Nov 17, 2015 at 3:48 PM, Neal Becker wrote: > > >> > > >> I have an array of shape > > >> (7, 24, 2, 1024) > > >> > > >> I'd like an array of > > >> (7, 24, 2048) > > >> > > >> such that the elements on the last dimension are interleaving the > > >> elements from the 3rd dimension > > >> > > >> [0,0,0,0] -> [0,0,0] > > >> [0,0,1,0] -> [0,0,1] > > >> [0,0,0,1] -> [0,0,2] > > >> [0,0,1,1] -> [0,0,3] > > >> ... > > >> > > >> What might be the simplest way to do this? > > > > > > np.transpose(A, (-2, -1)).reshape(A.shape[:-2] + (-1,)) > > > > I get an error on that 1st transpose: > > > > Transpose needs a slightly different input. 
If you look at the help, it > should be clear. The help might also point to np.swapaxes, which may be > a bit more straight forward for this exact case. Sorry about that. Was in a rush and working from a faulty memory. -------------- next part -------------- An HTML attachment was scrubbed... URL: From fperez.net at gmail.com Thu Nov 19 16:53:13 2015 From: fperez.net at gmail.com (Fernando Perez) Date: Thu, 19 Nov 2015 13:53:13 -0800 Subject: [Numpy-discussion] [JOB] Project Jupyter is hiring two postdoctoral fellows @ UC Berkeley Message-ID: Hi all, We are delighted to announce today that Project Jupyter/IPython has two postdoctoral fellowships open at UC Berkeley, open immediately. Interested candidates can apply here: https://aprecruit.berkeley.edu/apply/JPF00899 We hope to find candidates who will work on a number of challenging questions over the next few years, as described in our grant proposal here: http://blog.jupyter.org/2015/07/07/project-jupyter-computational-narratives-as-the-engine-of-collaborative-data-science/ Interested candidates should carefully read that proposal before applying to familiarize themselves with the full scope of the questions we intend to tackle. We'd like to thank the support of the Helmsley Trust, the Gordon and Betty Moore Foundation and the Alfred P. Sloan Foundation. Cheers, Brian Granger and Fernando Perez. -- Fernando Perez (@fperez_org; http://fperez.org) fperez.net-at-gmail: mailing lists only (I ignore this when swamped!) fernando.perez-at-berkeley: contact me here for any direct mail -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Fri Nov 20 15:40:16 2015 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 20 Nov 2015 15:40:16 -0500 Subject: [Numpy-discussion] asarray(sparse) -> object Message-ID: Is this intentional? >>> exog <50x5 sparse matrix of type '' with 50 stored elements in Compressed Sparse Column format> >>> np.asarray(exog) array(<50x5 sparse matrix of type '' with 50 stored elements in Compressed Sparse Column format>, dtype=object) I'm just a newbie who thought to use the usual pattern. .... >>> np.asarray(exog).dot(beta) array([ <50x5 sparse matrix of type '' with 50 stored elements in Compressed Sparse Column format>, <50x5 sparse matrix of type '' with 50 stored elements in Compressed Sparse Column format>, <50x5 sparse matrix of type '' with 50 stored elements in Compressed Sparse Column format>, <50x5 sparse matrix of type '' with 50 stored elements in Compressed Sparse Column format>, <50x5 sparse matrix of type '' with 50 stored elements in Compressed Sparse Column format>], dtype=object) C:\programs\WinPython-64bit-3.4.3.1\python-3.4.3.amd64\lib\site-packages\scipy\sparse\compressed.py:306: SparseEfficiencyWarning: Comparing sparse matrices using >= and <= is inefficient, using <, >, or !=, instead. "using <, >, or !=, instead.", SparseEfficiencyWarning) seems to warn only once >>> y = np.asarray(exog).dot(beta) >>> y.shape (5,) >>> np.__version__ '1.9.2rc1' >>> scipy.__version__ '0.15.1' Josef -------------- next part -------------- An HTML attachment was scrubbed... URL: From orion at cora.nwra.com Fri Nov 20 15:42:11 2015 From: orion at cora.nwra.com (Orion Poplawski) Date: Fri, 20 Nov 2015 13:42:11 -0700 Subject: [Numpy-discussion] Numpy 1.10.2rc1 In-Reply-To: References: Message-ID: <564F85A3.20204@cora.nwra.com> On 11/12/2015 02:11 PM, Charles R Harris wrote: > Hi All, > > I am pleased to announce the release of Numpy 1.10.2rc1. 
This release should > fix the problems exposed in 1.10.1, which is not to say there are no remaining > problems. Please test this thoroughly, exspecially if you experienced problems > with 1.10.1. Julian Taylor has opened an issue relating to cblas detection on > Debian (and probably Debian derived distributions) that is not dealt with in > this release. Hopefully a solution will be available before the final. So, this fails: File "setup.py", line 427, in fortran_extensionlists if StrictVersion(np.version.version) > StrictVersion("1.6.1"): File "/usr/lib64/python2.7/distutils/version.py", line 40, in __init__ self.parse(vstring) File "/usr/lib64/python2.7/distutils/version.py", line 107, in parse raise ValueError, "invalid version number '%s'" % vstring ValueError: invalid version number '1.10.2rc1' But I'm not sure numpy has made any contracts to follow the distutils StrictVersion format: http://epydoc.sourceforge.net/stdlib/distutils.version.StrictVersion-class.html Any thoughts? -- Orion Poplawski Technical Manager 303-415-9701 x222 NWRA, Boulder/CoRA Office FAX: 303-415-9702 3380 Mitchell Lane orion at nwra.com Boulder, CO 80301 http://www.nwra.com From charlesr.harris at gmail.com Fri Nov 20 16:00:37 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 20 Nov 2015 14:00:37 -0700 Subject: [Numpy-discussion] Numpy 1.10.2rc1 In-Reply-To: <564F85A3.20204@cora.nwra.com> References: <564F85A3.20204@cora.nwra.com> Message-ID: On Fri, Nov 20, 2015 at 1:42 PM, Orion Poplawski wrote: > On 11/12/2015 02:11 PM, Charles R Harris wrote: > > Hi All, > > > > I am pleased to announce the release of Numpy 1.10.2rc1. This release > should > > fix the problems exposed in 1.10.1, which is not to say there are no > remaining > > problems. Please test this thoroughly, exspecially if you experienced > problems > > with 1.10.1. Julian Taylor has opened an issue relating to cblas > detection on > > Debian (and probably Debian derived distributions) that is not dealt > with in > > this release. Hopefully a solution will be available before the final. > > So, this fails: > > File "setup.py", line 427, in fortran_extensionlists > if StrictVersion(np.version.version) > StrictVersion("1.6.1"): > File "/usr/lib64/python2.7/distutils/version.py", line 40, in __init__ > self.parse(vstring) > File "/usr/lib64/python2.7/distutils/version.py", line 107, in parse > raise ValueError, "invalid version number '%s'" % vstring > ValueError: invalid version number '1.10.2rc1' > > But I'm not sure numpy has made any contracts to follow the distutils > StrictVersion format: > > http://epydoc.sourceforge.net/stdlib/distutils.version.StrictVersion-class.html No, we don't support StrictVersion nor does Scipy. """Utility to compare (Numpy) version strings. The NumpyVersion class allows properly comparing numpy version strings. The LooseVersion and StrictVersion classes that distutils provides don't work; they don't recognize anything like alpha/beta/rc/dev versions. """ Looks like `numpy/distutils/mingw32ccompiler.py` needs fixing. Could you open an issue? The import of StrictVersion dates back to 2005, so not sure why this is turning up now. Maybe it is specific to compiling Fortran and we haven't done that with rc's before. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
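For anyone hitting the same error, a short sketch of how such a version check could be written with NumpyVersion instead, which (as the docstring quoted above says) understands alpha/beta/rc/dev suffixes that distutils' StrictVersion rejects:

```python
import numpy as np
from numpy.lib import NumpyVersion

# StrictVersion('1.10.2rc1') raises ValueError; NumpyVersion parses it and
# compares correctly against plain release strings.
if NumpyVersion(np.version.version) > '1.6.1':
    print("running something newer than 1.6.1")

print(NumpyVersion('1.10.2rc1') < '1.10.2')  # True, rc sorts before the final release
print(NumpyVersion('1.10.2rc1') > '1.10.1')  # True
```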
URL: From charlesr.harris at gmail.com Fri Nov 20 16:37:33 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 20 Nov 2015 14:37:33 -0700 Subject: [Numpy-discussion] Numpy 1.10.2rc1 In-Reply-To: References: <564F85A3.20204@cora.nwra.com> Message-ID: On Fri, Nov 20, 2015 at 2:00 PM, Charles R Harris wrote: > > > On Fri, Nov 20, 2015 at 1:42 PM, Orion Poplawski > wrote: > >> On 11/12/2015 02:11 PM, Charles R Harris wrote: >> > Hi All, >> > >> > I am pleased to announce the release of Numpy 1.10.2rc1. This release >> should >> > fix the problems exposed in 1.10.1, which is not to say there are no >> remaining >> > problems. Please test this thoroughly, exspecially if you experienced >> problems >> > with 1.10.1. Julian Taylor has opened an issue relating to cblas >> detection on >> > Debian (and probably Debian derived distributions) that is not dealt >> with in >> > this release. Hopefully a solution will be available before the final. >> >> So, this fails: >> >> File "setup.py", line 427, in fortran_extensionlists >> if StrictVersion(np.version.version) > StrictVersion("1.6.1"): >> File "/usr/lib64/python2.7/distutils/version.py", line 40, in __init__ >> self.parse(vstring) >> File "/usr/lib64/python2.7/distutils/version.py", line 107, in parse >> raise ValueError, "invalid version number '%s'" % vstring >> ValueError: invalid version number '1.10.2rc1' >> >> But I'm not sure numpy has made any contracts to follow the distutils >> StrictVersion format: >> >> http://epydoc.sourceforge.net/stdlib/distutils.version.StrictVersion-class.html > > > No, we don't support StrictVersion nor does Scipy. > > """Utility to compare (Numpy) version strings. > > The NumpyVersion class allows properly comparing numpy version strings. > The LooseVersion and StrictVersion classes that distutils provides don't > work; they don't recognize anything like alpha/beta/rc/dev versions. > > """ > > Looks like `numpy/distutils/mingw32ccompiler.py` needs fixing. Could you > open an issue? The import of StrictVersion dates back to 2005, so not sure > why this is turning up now. Maybe it is specific to compiling Fortran and > we haven't done that with rc's before. > In fact, I don't see where the call is coming from. Is this something specific to your project? If so, you want to use NumpyVersion which you can import from `numpy.lib`. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From perimosocordiae at gmail.com Fri Nov 20 18:29:08 2015 From: perimosocordiae at gmail.com (CJ Carey) Date: Fri, 20 Nov 2015 17:29:08 -0600 Subject: [Numpy-discussion] asarray(sparse) -> object In-Reply-To: References: Message-ID: The short answer is: "kind of". These two Github issues explain what's going on more in-depth: https://github.com/scipy/scipy/issues/3995 https://github.com/scipy/scipy/issues/4239 As for the warning only showing once, that's Python's default behavior for warnings: http://stackoverflow.com/q/22661745/10601 -CJ On Fri, Nov 20, 2015 at 2:40 PM, wrote: > Is this intentional? > > > >>> exog > <50x5 sparse matrix of type '' > with 50 stored elements in Compressed Sparse Column format> > > >>> np.asarray(exog) > array(<50x5 sparse matrix of type '' > with 50 stored elements in Compressed Sparse Column format>, dtype=object) > > > I'm just a newbie who thought to use the usual pattern. > > > .... 
> > >>> np.asarray(exog).dot(beta) > array([ <50x5 sparse matrix of type '' > with 50 stored elements in Compressed Sparse Column format>, > <50x5 sparse matrix of type '' > with 50 stored elements in Compressed Sparse Column format>, > <50x5 sparse matrix of type '' > with 50 stored elements in Compressed Sparse Column format>, > <50x5 sparse matrix of type '' > with 50 stored elements in Compressed Sparse Column format>, > <50x5 sparse matrix of type '' > with 50 stored elements in Compressed Sparse Column format>], dtype=object) > C:\programs\WinPython-64bit-3.4.3.1\python-3.4.3.amd64\lib\site-packages\scipy\sparse\compressed.py:306: > SparseEfficiencyWarning: Comparing sparse matrices using >= and <= is > inefficient, using <, >, or !=, instead. > "using <, >, or !=, instead.", SparseEfficiencyWarning) > > seems to warn only once > > >>> y = np.asarray(exog).dot(beta) > >>> y.shape > (5,) > > > >>> np.__version__ > '1.9.2rc1' > > >>> scipy.__version__ > '0.15.1' > > > > Josef > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Fri Nov 20 18:57:08 2015 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 20 Nov 2015 18:57:08 -0500 Subject: [Numpy-discussion] asarray(sparse) -> object In-Reply-To: References: Message-ID: On Fri, Nov 20, 2015 at 6:29 PM, CJ Carey wrote: > The short answer is: "kind of". > > These two Github issues explain what's going on more in-depth: > https://github.com/scipy/scipy/issues/3995 > https://github.com/scipy/scipy/issues/4239 > Thanks, I didn't pay attention to those issues, or only very superficially. +1 for doing anything else than converting to object arrays. > > > As for the warning only showing once, that's Python's default behavior for > warnings: http://stackoverflow.com/q/22661745/10601 > The default should be overwritten for warnings that are always relevant. I usually don't use sparse arrays, and don't know if this should always warn. Josef > > -CJ > > On Fri, Nov 20, 2015 at 2:40 PM, wrote: > >> Is this intentional? >> >> >> >>> exog >> <50x5 sparse matrix of type '' >> with 50 stored elements in Compressed Sparse Column format> >> >> >>> np.asarray(exog) >> array(<50x5 sparse matrix of type '' >> with 50 stored elements in Compressed Sparse Column format>, dtype=object) >> >> >> I'm just a newbie who thought to use the usual pattern. >> >> >> .... >> >> >>> np.asarray(exog).dot(beta) >> array([ <50x5 sparse matrix of type '' >> with 50 stored elements in Compressed Sparse Column format>, >> <50x5 sparse matrix of type '' >> with 50 stored elements in Compressed Sparse Column format>, >> <50x5 sparse matrix of type '' >> with 50 stored elements in Compressed Sparse Column format>, >> <50x5 sparse matrix of type '' >> with 50 stored elements in Compressed Sparse Column format>, >> <50x5 sparse matrix of type '' >> with 50 stored elements in Compressed Sparse Column format>], >> dtype=object) >> C:\programs\WinPython-64bit-3.4.3.1\python-3.4.3.amd64\lib\site-packages\scipy\sparse\compressed.py:306: >> SparseEfficiencyWarning: Comparing sparse matrices using >= and <= is >> inefficient, using <, >, or !=, instead. 
>> "using <, >, or !=, instead.", SparseEfficiencyWarning) >> >> seems to warn only once >> >> >>> y = np.asarray(exog).dot(beta) >> >>> y.shape >> (5,) >> >> >> >>> np.__version__ >> '1.9.2rc1' >> >> >>> scipy.__version__ >> '0.15.1' >> >> >> >> Josef >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jeffreback at gmail.com Sat Nov 21 08:43:52 2015 From: jeffreback at gmail.com (Jeff Reback) Date: Sat, 21 Nov 2015 08:43:52 -0500 Subject: [Numpy-discussion] ANN: pandas v0.17.1 Released Message-ID: Hi, We are proud to announce that *pandas* has become a sponsored project of the NUMFocus organization This will help ensure the success of development of *pandas* as a world-class open-source project. This is a minor bug-fix release from 0.17.0 and includes a large number of bug fixes along several new features, enhancements, and performance improvements. We recommend that all users upgrade to this version. This was a release of 5 weeks with 176 commits by 61 authors encompassing 84 issues and 128 pull-requests. *What is it:* *pandas* is a Python package providing fast, flexible, and expressive data structures designed to make working with ?relational? or ?labeled? data both easy and intuitive. It aims to be the fundamental high-level building block for doing practical, real world data analysis in Python. Additionally, it has the broader goal of becoming the most powerful and flexible open source data analysis / manipulation tool available in any language. *Highlights*: - Support for Conditional HTML Formatting, see here - Releasing the GIL on the csv reader & other ops, see here - Fixed regression in DataFrame.drop_duplicates from 0.16.2, causing incorrect results on integer values see Issue 11376 See the Whatsnew for much more information and the full Documentation link. *How to get it:* Source tarballs, windows wheels, and macosx wheels are available on PyPI Installation via conda is: - conda install pandas windows wheels are courtesy of Christoph Gohlke and are built on Numpy 1.9 macosx wheels are courtesy of Matthew Brett *Issues:* Please report any issues on our issue tracker : Jeff *Thanks to all of the contributors* * - Aleksandr Drozd - Alex Chase - Anthonios Partheniou - BrenBarn - Brian J. 
McGuirk - Chris - Christian Berendt - Christian Perez - Cody Piersall - Data & Code Expert Experimenting with Code on Data - DrIrv - Evan Wright - Guillaume Gay - Hamed Saljooghinejad - Iblis Lin - Jake VanderPlas - Jan Schulz - Jean-Mathieu Deschenes - Jeff Reback - Jimmy Callin - Joris Van den Bossche - K.-Michael Aye - Ka Wo Chen - Lo?c S?guin-C - Luo Yicheng - Magnus J?ud - Manuel Leonhardt - Matthew Gilbert - Maximilian Roos - Michael - Nicholas Stahl - Nicolas Bonnotte - Pastafarianist - Petra Chong - Phil Schaf - Philipp A - Rob deCarvalho - Roman Khomenko - R?my L?one - Sebastian Bank - Thierry Moisan - Tom Augspurger - Tux1 - Varun - Wieland Hoffmann - Winterflower - Yoav Ram - Younggun Kim - Zeke - ajcr - azuranski - behzad nouri - cel4 - emilydolson - hironow - lexual - llllllllll - rockg - silentquasar - sinhrks - taeold * -------------- next part -------------- An HTML attachment was scrubbed... URL: From brad.reisfeld at gmail.com Sat Nov 21 09:30:46 2015 From: brad.reisfeld at gmail.com (Brad Reisfeld) Date: Sat, 21 Nov 2015 06:30:46 -0800 (PST) Subject: [Numpy-discussion] ANN: pandas v0.17.1 Released In-Reply-To: References: Message-ID: <10c8bcba-8046-432d-a135-299597052898@googlegroups.com> To Jeff and all of the contributors, Thank you for your hard and dedicated work on pandas! It is an awesome package that gets better with every release. -Brad On Saturday, November 21, 2015 at 6:44:02 AM UTC-7, Jeff wrote: > > Hi, > > We are proud to announce that *pandas* has become a sponsored project of > the NUMFocus organization > > - private > > This will help ensure the success of development of *pandas* as a > world-class open-source project. > > This is a minor bug-fix release from 0.17.0 and includes a large number of > bug fixes along several new features, enhancements, and performance > improvements. > We recommend that all users upgrade to this version. > > This was a release of 5 weeks with 176 commits by 61 authors encompassing > 84 issues and 128 pull-requests. > > > *What is it:* > > *pandas* is a Python package providing fast, flexible, and expressive data > structures designed to make working with ?relational? or ?labeled? data > both > easy and intuitive. It aims to be the fundamental high-level building > block for > doing practical, real world data analysis in Python. Additionally, it has > the > broader goal of becoming the most powerful and flexible open source data > analysis / manipulation tool available in any language. > > *Highlights*: > > > - Support for Conditional HTML Formatting, see here > > - private > > - Releasing the GIL on the csv reader & other ops, see here > > - private > > - Fixed regression in DataFrame.drop_duplicates from 0.16.2, causing > incorrect results on integer values see Issue 11376 > > > See the Whatsnew > - > private > for > much more information and the full Documentation > - private > link. > > *How to get it:* > > Source tarballs, windows wheels, and macosx wheels are available on PyPI > - private > > > Installation via conda is: > > - conda install pandas > > windows wheels are courtesy of Christoph Gohlke and are built on Numpy > 1.9 > macosx wheels are courtesy of Matthew Brett > > *Issues:* > > Please report any issues on our issue tracker > - private > : > > Jeff > > *Thanks to all of the contributors* > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > * - Aleksandr Drozd - Alex Chase - Anthonios Partheniou - BrenBarn - Brian > J. 
McGuirk - Chris - Christian Berendt - Christian Perez - Cody Piersall - > Data & Code Expert Experimenting with Code on Data - DrIrv - Evan Wright - > Guillaume Gay - Hamed Saljooghinejad - Iblis Lin - Jake VanderPlas - Jan > Schulz - Jean-Mathieu Deschenes - Jeff Reback - Jimmy Callin - Joris Van > den Bossche - K.-Michael Aye - Ka Wo Chen - Lo?c S?guin-C - Luo Yicheng - > Magnus J?ud - Manuel Leonhardt - Matthew Gilbert - Maximilian Roos - > Michael - Nicholas Stahl - Nicolas Bonnotte - Pastafarianist - Petra Chong > - Phil Schaf - Philipp A - Rob deCarvalho - Roman Khomenko - R?my L?one - > Sebastian Bank - Thierry Moisan - Tom Augspurger - Tux1 - Varun - Wieland > Hoffmann - Winterflower - Yoav Ram - Younggun Kim - Zeke - ajcr - azuranski > - behzad nouri - cel4 - emilydolson - hironow - lexual - llllllllll - rockg > - silentquasar - sinhrks - taeold * > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From glenn.caltech at gmail.com Sat Nov 21 22:54:27 2015 From: glenn.caltech at gmail.com (G Jones) Date: Sat, 21 Nov 2015 22:54:27 -0500 Subject: [Numpy-discussion] record array performance issue / bug Message-ID: Hi, Using the latest numpy from anaconda (1.10.1) on Python 2.7, I found that the following code works OK if npackets = 2, but acts bizarrely if npackets is large (2**12): ----------- npackets = 2**12 dlen=2048 PacketType = np.dtype([('timestamp','float64'), ('pkts',np.dtype(('int8',(npackets,dlen)))), ('data',np.dtype(('int8',(npackets*dlen,)))), ]) b = np.zeros((1,),dtype=PacketType) b['timestamp'] # Should return array([0.0]) ---------------- Specifically, if npackets is large, i.e. 2**12 or 2**16, trying to access b['timestamp'] results in 100% CPU usage while the memory consumption is increasing by hundreds of MB per second. When I interrupt, I find the traceback in numpy/core/_internal.pyc : _get_all_field_offsets Since it seems to work for small values of npackets, I suspect that if I had the memory and time, the access to b['timestamp'] would eventually return, so I think the issue is that the algorithm doesn't scale well with record dtypes made up of lots of bytes. Looking on Github, I can see this code has been in flux recently, but I can't quite tell if the issue I'm seeing is addressed by the issues being discussed and tackled there. Thanks, Glenn -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sun Nov 22 11:52:04 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 22 Nov 2015 09:52:04 -0700 Subject: [Numpy-discussion] record array performance issue / bug In-Reply-To: References: Message-ID: On Sat, Nov 21, 2015 at 8:54 PM, G Jones wrote: > Hi, > Using the latest numpy from anaconda (1.10.1) on Python 2.7, I found that > the following code works OK if npackets = 2, but acts bizarrely if npackets > is large (2**12): > > ----------- > > npackets = 2**12 > dlen=2048 > PacketType = np.dtype([('timestamp','float64'), > ('pkts',np.dtype(('int8',(npackets,dlen)))), > ('data',np.dtype(('int8',(npackets*dlen,)))), > ]) > > b = np.zeros((1,),dtype=PacketType) > > b['timestamp'] # Should return array([0.0]) > > ---------------- > > Specifically, if npackets is large, i.e. 2**12 or 2**16, trying to access > b['timestamp'] results in 100% CPU usage while the memory consumption is > increasing by hundreds of MB per second. 
When I interrupt, I find the > traceback in numpy/core/_internal.pyc : _get_all_field_offsets > Since it seems to work for small values of npackets, I suspect that if I > had the memory and time, the access to b['timestamp'] would eventually > return, so I think the issue is that the algorithm doesn't scale well with > record dtypes made up of lots of bytes. > Looking on Github, I can see this code has been in flux recently, but I > can't quite tell if the issue I'm seeing is addressed by the issues being > discussed and tackled there. > This should be fixed in 1.10.2. 1.10.2rc1 is up on sourceforge if you want to test it. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From eliben at gmail.com Mon Nov 23 16:31:36 2015 From: eliben at gmail.com (Eli Bendersky) Date: Mon, 23 Nov 2015 13:31:36 -0800 Subject: [Numpy-discussion] understanding buffering done when broadcasting Message-ID: Hello, I'm trying to understand the buffering done by the Numpy iterator interface (the new post 1.6-one) when running ufuncs on arrays that require broadcasting. Consider this simple case: In [35]: m = np.arange(16).reshape(4,4) In [37]: n = np.arange(4) In [39]: m + n Out[39]: array([[ 0, 2, 4, 6], [ 4, 6, 8, 10], [ 8, 10, 12, 14], [12, 14, 16, 18]]) If I instrument Numpy (setting NPY_IT_DBG_TRACING and such), I see that when the add() ufunc is called, 'n' is copied into a temporary buffer by the iterator. The ufunc then gets the buffer as its data. My question is: why is this buffering needed? It seems wasteful, since no casting is required here, no special alignment problems and also 'n' is contiguously laid out in memory. It seems that it would be more efficient to just use 'n' in the ufunc instead of passing in the buffer. What am I missing? Thanks in advance, Eli -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Mon Nov 23 17:09:53 2015 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Mon, 23 Nov 2015 23:09:53 +0100 Subject: [Numpy-discussion] understanding buffering done when broadcasting In-Reply-To: References: Message-ID: <1448316593.1604.33.camel@sipsolutions.net> On Mo, 2015-11-23 at 13:31 -0800, Eli Bendersky wrote: > Hello, > > > I'm trying to understand the buffering done by the Numpy iterator > interface (the new post 1.6-one) when running ufuncs on arrays that > require broadcasting. Consider this simple case: > > In [35]: m = np.arange(16).reshape(4,4) > In [37]: n = np.arange(4) > In [39]: m + n > Out[39]: > array([[ 0, 2, 4, 6], > [ 4, 6, 8, 10], > [ 8, 10, 12, 14], > [12, 14, 16, 18]]) > > On first sight this seems true. However, there is one other point to consider here. The inner ufunc loop can only handle a single stride. The contiguous array `n` has to be iterated as if it had the strides `(0, 8)`, which is not the strides of the contiguous array `m` which can be "unrolled" to 1-D. Those effective strides are thus not contiguous for the inner ufunc loop and cannot be unrolled into a single ufunc call! The optimization (which might kick in a bit more broadly maybe), is thus that the number of inner loop calls is minimized, whether that is worth it, I am not sure, it may well be that there is some worthy optimization possible here. 
Note however, that this does not occur for large inner loop sizes (though I think you can find some "bad" sizes): ``` In [18]: n = np.arange(40000) In [19]: m = np.arange(160000).reshape(4,40000) In [20]: o = m + n Iterator: Checking casting for operand 0 op: dtype('int64'), iter: dtype('int64') Iterator: Checking casting for operand 1 op: dtype('int64'), iter: dtype('int64') Iterator: Checking casting for operand 2 op: , iter: dtype('int64') Iterator: Setting allocated stride 1 for iterator dimension 0 to 8 Iterator: Setting allocated stride 0 for iterator dimension 1 to 320000 Iterator: Copying inputs to buffers Iterator: Expanding inner loop size from 8192 to 40000 since buffering wasn't needed Any buffering needed: 0 Iterator: Finished copying inputs to buffers (buffered size is 40000) ``` Anyway, feel free to have a look ;). The code is not the most read one in NumPy, and it would not surprise me a lot if you can find something to tweak. - Sebastian > > If I instrument Numpy (setting NPY_IT_DBG_TRACING and such), I see > that when the add() ufunc is called, 'n' is copied into a temporary > buffer by the iterator. The ufunc then gets the buffer as its data. > > > My question is: why is this buffering needed? It seems wasteful, since > no casting is required here, no special alignment problems and also > 'n' is contiguously laid out in memory. It seems that it would be more > efficient to just use 'n' in the ufunc instead of passing in the > buffer. What am I missing? > > > Thanks in advance, > > Eli > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From p.e.creasey.00 at googlemail.com Tue Nov 24 14:42:19 2015 From: p.e.creasey.00 at googlemail.com (Peter Creasey) Date: Tue, 24 Nov 2015 11:42:19 -0800 Subject: [Numpy-discussion] Misleading/erroneous TypeError message Message-ID: Hi, I just upgraded my numpy and started to received a TypeError from one of my codes that relied on the old, less strict, casting behaviour. The error message, however, left me scratching my head when trying to debug something like this: >>> a = array([0],dtype=uint64) >>> a += array([1],dtype=int64) TypeError: Cannot cast ufunc add output from dtype('float64') to dtype('uint64') with casting rule 'same_kind' Where does the 'float64' come from?!?! Peter PS Thanks for all the great work guys, numpy is a fantastic tool and has been a lot of help to me over the years! -------------- next part -------------- An HTML attachment was scrubbed... URL: From jakirkham at gmail.com Tue Nov 24 14:57:00 2015 From: jakirkham at gmail.com (John Kirkham) Date: Tue, 24 Nov 2015 14:57:00 -0500 Subject: [Numpy-discussion] ENH: Add the function 'expand_view' Message-ID: <57E05793-75DE-47A8-896F-F864C6915577@gmail.com> Takes an array and tacks on arbitrary dimensions on either side, which is returned as a view always. Here are the relevant features: * Creates a view of the array that has the dimensions before and after tacked on to it. * Takes the before and after arguments independent of each other and the current shape. * Allows for read and write access to the underlying array. To see an example of what this would look like, see this PR ( https://github.com/numpy/numpy/pull/6713 ). 
-------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Tue Nov 24 15:39:20 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 24 Nov 2015 13:39:20 -0700 Subject: [Numpy-discussion] Misleading/erroneous TypeError message In-Reply-To: References: Message-ID: On Tue, Nov 24, 2015 at 12:42 PM, Peter Creasey < p.e.creasey.00 at googlemail.com> wrote: > Hi, > > I just upgraded my numpy and started to received a TypeError from one of > my codes that relied on the old, less strict, casting behaviour. The error > message, however, left me scratching my head when trying to debug something > like this: > > >>> a = array([0],dtype=uint64) > >>> a += array([1],dtype=int64) > TypeError: Cannot cast ufunc add output from dtype('float64') to > dtype('uint64') with casting rule 'same_kind' > > Where does the 'float64' come from?!?! > The combination of uint64 and int64 leads to promotion to float64 as the best option for the combination of signed and unsigned. To fix things, you can either use `np.add` with an output argument and `casting='unsafe'` or just be careful about using unsigned types. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Tue Nov 24 19:13:07 2015 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 24 Nov 2015 16:13:07 -0800 Subject: [Numpy-discussion] ENH: Add the function 'expand_view' In-Reply-To: <57E05793-75DE-47A8-896F-F864C6915577@gmail.com> References: <57E05793-75DE-47A8-896F-F864C6915577@gmail.com> Message-ID: On Nov 24, 2015 11:57 AM, "John Kirkham" wrote: > > Takes an array and tacks on arbitrary dimensions on either side, which is returned as a view always. Here are the relevant features: > > * Creates a view of the array that has the dimensions before and after tacked on to it. > * Takes the before and after arguments independent of each other and the current shape. > * Allows for read and write access to the underlying array. Can you expand this with some discussion of why you want this function, and why you chose these specific features? (E.g. as mentioned in the PR comments already, the reason broadcast_to returns a read-only array is that it was decided that this was less confusing for users, not because of any technical issue.) -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From eliben at gmail.com Tue Nov 24 19:49:44 2015 From: eliben at gmail.com (Eli Bendersky) Date: Tue, 24 Nov 2015 16:49:44 -0800 Subject: [Numpy-discussion] understanding buffering done when broadcasting In-Reply-To: <1448316593.1604.33.camel@sipsolutions.net> References: <1448316593.1604.33.camel@sipsolutions.net> Message-ID: On Mon, Nov 23, 2015 at 2:09 PM, Sebastian Berg wrote: > On Mo, 2015-11-23 at 13:31 -0800, Eli Bendersky wrote: > > Hello, > > > > > > I'm trying to understand the buffering done by the Numpy iterator > > interface (the new post 1.6-one) when running ufuncs on arrays that > > require broadcasting. Consider this simple case: > > > > In [35]: m = np.arange(16).reshape(4,4) > > In [37]: n = np.arange(4) > > In [39]: m + n > > Out[39]: > > array([[ 0, 2, 4, 6], > > [ 4, 6, 8, 10], > > [ 8, 10, 12, 14], > > [12, 14, 16, 18]]) > > > > > > > On first sight this seems true. However, there is one other point to > consider here. The inner ufunc loop can only handle a single stride. 
The > contiguous array `n` has to be iterated as if it had the strides > `(0, 8)`, which is not the strides of the contiguous array `m` which can > be "unrolled" to 1-D. Those effective strides are thus not contiguous > for the inner ufunc loop and cannot be unrolled into a single ufunc > call! > > The optimization (which might kick in a bit more broadly maybe), is thus > that the number of inner loop calls is minimized, whether that is worth > it, I am not sure, it may well be that there is some worthy optimization > possible here. > Note however, that this does not occur for large inner loop sizes > (though I think you can find some "bad" sizes): > > ``` > In [18]: n = np.arange(40000) > > In [19]: m = np.arange(160000).reshape(4,40000) > > In [20]: o = m + n > Iterator: Checking casting for operand 0 > op: dtype('int64'), iter: dtype('int64') > Iterator: Checking casting for operand 1 > op: dtype('int64'), iter: dtype('int64') > Iterator: Checking casting for operand 2 > op: , iter: dtype('int64') > Iterator: Setting allocated stride 1 for iterator dimension 0 to 8 > Iterator: Setting allocated stride 0 for iterator dimension 1 to 320000 > Iterator: Copying inputs to buffers > Iterator: Expanding inner loop size from 8192 to 40000 since buffering > wasn't needed > Any buffering needed: 0 > Iterator: Finished copying inputs to buffers (buffered size is 40000) > ``` > The heuristic in the code says that if we can use a single stride and that's larger than the buffer size (which I assume is the default buffer size, and can change) then it's "is_onestride" and no buffering is done. So this led me to explore around this threshold (8192 items by default on my machine), and indeed we can notice funny behavior: In [51]: %%timeit n = arange(8192); m = np.arange(8192*40).reshape(40,8192) o = m + n ....: 1000 loops, best of 3: 274 ?s per loop In [52]: %%timeit n = arange(8292); m = np.arange(8292*40).reshape(40,8292) o = m + n ....: 1000 loops, best of 3: 229 ?s per loop So, given this, it's not very clear why the "optimization" kicks in. Buffering for small sizes seems like a mistake. Eli > > Anyway, feel free to have a look ;). The code is not the most read one > in NumPy, and it would not surprise me a lot if you can find something > to tweak. > > - Sebastian > > > > > > If I instrument Numpy (setting NPY_IT_DBG_TRACING and such), I see > > that when the add() ufunc is called, 'n' is copied into a temporary > > buffer by the iterator. The ufunc then gets the buffer as its data. > > > > > > My question is: why is this buffering needed? It seems wasteful, since > > no casting is required here, no special alignment problems and also > > 'n' is contiguously laid out in memory. It seems that it would be more > > efficient to just use 'n' in the ufunc instead of passing in the > > buffer. What am I missing? > > > > > > Thanks in advance, > > > > Eli > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... 
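A standalone version of the timing comparison above, for anyone who wants to reproduce it outside IPython. The 8192 figure is the default ufunc buffer size (see np.getbufsize()), so inner loop sizes just below and above it exercise different iterator code paths:

```python
import timeit
import numpy as np

def bench(inner, repeats=1000):
    n = np.arange(inner)
    m = np.arange(inner * 40).reshape(40, inner)
    t = timeit.timeit(lambda: m + n, number=repeats)
    print("inner size %5d: %6.1f us per add" % (inner, 1e6 * t / repeats))

for inner in (8192, 8292):
    bench(inner)
```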
URL: From p.e.creasey.00 at googlemail.com Tue Nov 24 20:42:51 2015 From: p.e.creasey.00 at googlemail.com (Peter Creasey) Date: Tue, 24 Nov 2015 17:42:51 -0800 Subject: [Numpy-discussion] Misleading/erroneous TypeError message Message-ID: > > I just upgraded my numpy and started to received a TypeError from one of > > my codes that relied on the old, less strict, casting behaviour. The error > > message, however, left me scratching my head when trying to debug something > > like this: > > > > >>> a = array([0],dtype=uint64) > > >>> a += array([1],dtype=int64) > > TypeError: Cannot cast ufunc add output from dtype('float64') to > > dtype('uint64') with casting rule 'same_kind' > > > > Where does the 'float64' come from?!?! > > > > The combination of uint64 and int64 leads to promotion to float64 as the > best option for the combination of signed and unsigned. To fix things, you > can either use `np.add` with an output argument and `casting='unsafe'` or > just be careful about using unsigned types. Thanks for the quick response. I understand there are reasons for the promotion to float64 (although my expectation would usually be that Numpy is going to follow C conventions), however the I found the error a little unhelpful. In particular Numpy is complaining about a dtype (float64) that it silently promoted to, rather than the dtype that the user provided, which generally seems like a bad idea. Could Numpy somehow complain about the original dtypes in this case? Or at least give a warning about the first promotion (e.g. loss of precision)? Peter From josef.pktd at gmail.com Tue Nov 24 21:13:47 2015 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 24 Nov 2015 21:13:47 -0500 Subject: [Numpy-discussion] ENH: Add the function 'expand_view' In-Reply-To: References: <57E05793-75DE-47A8-896F-F864C6915577@gmail.com> Message-ID: On Tue, Nov 24, 2015 at 7:13 PM, Nathaniel Smith wrote: > On Nov 24, 2015 11:57 AM, "John Kirkham" wrote: > > > > Takes an array and tacks on arbitrary dimensions on either side, which > is returned as a view always. Here are the relevant features: > > > > * Creates a view of the array that has the dimensions before and after > tacked on to it. > > * Takes the before and after arguments independent of each other and the > current shape. > > * Allows for read and write access to the underlying array. > > Can you expand this with some discussion of why you want this function, > and why you chose these specific features? (E.g. as mentioned in the PR > comments already, the reason broadcast_to returns a read-only array is that > it was decided that this was less confusing for users, not because of any > technical issue.) > Why is this a stride_trick? I thought this looks similar to expand_dims and could maybe be implemented with some extra options there. Josef > -n > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... 
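Going back to the uint64/int64 TypeError discussed above, a minimal sketch of the two workarounds that were suggested (an explicit output array with casting='unsafe', or avoiding the mixed signed/unsigned combination in the first place):

```python
import numpy as np

a = np.array([0], dtype=np.uint64)
b = np.array([1], dtype=np.int64)

# uint64 + int64 promotes to float64, so `a += b` is rejected under the
# 'same_kind' casting rule. An explicit output with unsafe casting keeps
# the result in uint64:
np.add(a, b, out=a, casting='unsafe')
print(a, a.dtype)        # [1] uint64

# Or sidestep the promotion entirely by staying within one signedness:
a += b.astype(np.uint64)
print(a, a.dtype)        # [2] uint64
```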
URL: From sebastian at sipsolutions.net Wed Nov 25 03:21:25 2015 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Wed, 25 Nov 2015 09:21:25 +0100 Subject: [Numpy-discussion] understanding buffering done when broadcasting In-Reply-To: References: <1448316593.1604.33.camel@sipsolutions.net> Message-ID: <1448439685.16828.14.camel@sipsolutions.net> On Di, 2015-11-24 at 16:49 -0800, Eli Bendersky wrote: > > > On Mon, Nov 23, 2015 at 2:09 PM, Sebastian Berg > wrote: > On Mo, 2015-11-23 at 13:31 -0800, Eli Bendersky wrote: > > Hello, > > > > > > I'm trying to understand the buffering done by the Numpy > iterator > > interface (the new post 1.6-one) when running ufuncs on > arrays that > > require broadcasting. Consider this simple case: > > > > In [35]: m = np.arange(16).reshape(4,4) > > In [37]: n = np.arange(4) > > In [39]: m + n > > Out[39]: > > array([[ 0, 2, 4, 6], > > [ 4, 6, 8, 10], > > [ 8, 10, 12, 14], > > [12, 14, 16, 18]]) > > > > > > > On first sight this seems true. However, there is one other > point to > consider here. The inner ufunc loop can only handle a single > stride. The > contiguous array `n` has to be iterated as if it had the > strides > `(0, 8)`, which is not the strides of the contiguous array `m` > which can > be "unrolled" to 1-D. Those effective strides are thus not > contiguous > for the inner ufunc loop and cannot be unrolled into a single > ufunc > call! > > The optimization (which might kick in a bit more broadly > maybe), is thus > that the number of inner loop calls is minimized, whether that > is worth > it, I am not sure, it may well be that there is some worthy > optimization > possible here. > Note however, that this does not occur for large inner loop > sizes > (though I think you can find some "bad" sizes): > > ``` > In [18]: n = np.arange(40000) > > In [19]: m = np.arange(160000).reshape(4,40000) > > In [20]: o = m + n > Iterator: Checking casting for operand 0 > op: dtype('int64'), iter: dtype('int64') > Iterator: Checking casting for operand 1 > op: dtype('int64'), iter: dtype('int64') > Iterator: Checking casting for operand 2 > op: , iter: dtype('int64') > Iterator: Setting allocated stride 1 for iterator dimension 0 > to 8 > Iterator: Setting allocated stride 0 for iterator dimension 1 > to 320000 > Iterator: Copying inputs to buffers > Iterator: Expanding inner loop size from 8192 to 40000 since > buffering > wasn't needed > Any buffering needed: 0 > Iterator: Finished copying inputs to buffers (buffered size is > 40000) > ``` > > > The heuristic in the code says that if we can use a single stride and > that's larger than the buffer size (which I assume is the default > buffer size, and can change) then it's "is_onestride" and no buffering > is done. > > > So this led me to explore around this threshold (8192 items by default > on my machine), and indeed we can notice funny behavior: > > In [51]: %%timeit n = arange(8192); m = > np.arange(8192*40).reshape(40,8192) > o = m + n > ....: > 1000 loops, best of 3: 274 ?s per loop > > In [52]: %%timeit n = arange(8292); m = > np.arange(8292*40).reshape(40,8292) > o = m + n > ....: > 1000 loops, best of 3: 229 ?s per loop > > > So, given this, it's not very clear why the "optimization" kicks in. > Buffering for small sizes seems like a mistake. > I am pretty sure it is not generally a mistake. Consider the case of an 10000x3 array (note that shrinking the buffer can have great advantage though, I am not sure if this is done usually). 
If you have (10000, 3) + (3,) arrays, then the ufunc outer loop would have 10000x overhead. Doing the buffering (which I believe has some extra code to be faster), will lower this to a few ufunc inner loop calls. I have not timed it, but I would be a bit surprised if it was not faster in this case at least. Even calling a C function (and looping) for an inner loop of 3 elements, should be quite a bit of overhead, and my guess is more overhead than the buffering. - Sebastian > > Eli > > > > > Anyway, feel free to have a look ;). The code is not the most > read one > in NumPy, and it would not surprise me a lot if you can find > something > to tweak. > > - Sebastian > > > > > > If I instrument Numpy (setting NPY_IT_DBG_TRACING and such), > I see > > that when the add() ufunc is called, 'n' is copied into a > temporary > > buffer by the iterator. The ufunc then gets the buffer as > its data. > > > > > > My question is: why is this buffering needed? It seems > wasteful, since > > no casting is required here, no special alignment problems > and also > > 'n' is contiguously laid out in memory. It seems that it > would be more > > efficient to just use 'n' in the ufunc instead of passing in > the > > buffer. What am I missing? > > > > > > Thanks in advance, > > > > Eli > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From antlarac at gmail.com Wed Nov 25 17:31:47 2015 From: antlarac at gmail.com (Antonio Lara) Date: Wed, 25 Nov 2015 23:31:47 +0100 Subject: [Numpy-discussion] New functions added in pull request Message-ID: Hello, I have added three new functions to the file function_base.py in the numpy/lib folder. These are divergence, curl and laplacian (for the moment, laplacian of a scalar field, maybe in the future I will try laplacian for a vector field). The calculation method is based in the existing one for numpy.gradient, with central differences. The changes are in this pull request: https://github.com/numpy/numpy/pull/6727 Thank you, Antonio -------------- next part -------------- An HTML attachment was scrubbed... URL: From manolo at austrohungaro.com Thu Nov 26 10:18:01 2015 From: manolo at austrohungaro.com (Manolo =?iso-8859-1?Q?Mart=EDnez?=) Date: Thu, 26 Nov 2015 16:18:01 +0100 Subject: [Numpy-discussion] Recognizing a cycle in a vector Message-ID: <20151126151801.GA12553@beagle> Dear all, Suppose that I have a vector with the numerical solution of a differential equation -- more concretely, I am working with evolutionary game theory models, and the solutions are frequencies of types in a population that follows the replicator dynamics; but this is probably irrelevant. Sometimes these solutions are cyclical, yet I sample at points which do not correspond with the period of the cycle, so that np.allclose() cannot be directly applied. Is there any way to check for cycles in this situation? 
Thanks for any advice, Manolo From ben.v.root at gmail.com Thu Nov 26 10:30:57 2015 From: ben.v.root at gmail.com (Benjamin Root) Date: Thu, 26 Nov 2015 10:30:57 -0500 Subject: [Numpy-discussion] ENH: Add the function 'expand_view' In-Reply-To: References: <57E05793-75DE-47A8-896F-F864C6915577@gmail.com> Message-ID: How is this different from using np.newaxis and broadcasting? Or am I misunderstanding this? Ben Root On Tue, Nov 24, 2015 at 9:13 PM, wrote: > > > On Tue, Nov 24, 2015 at 7:13 PM, Nathaniel Smith wrote: > >> On Nov 24, 2015 11:57 AM, "John Kirkham" wrote: >> > >> > Takes an array and tacks on arbitrary dimensions on either side, which >> is returned as a view always. Here are the relevant features: >> > >> > * Creates a view of the array that has the dimensions before and after >> tacked on to it. >> > * Takes the before and after arguments independent of each other and >> the current shape. >> > * Allows for read and write access to the underlying array. >> >> Can you expand this with some discussion of why you want this function, >> and why you chose these specific features? (E.g. as mentioned in the PR >> comments already, the reason broadcast_to returns a read-only array is that >> it was decided that this was less confusing for users, not because of any >> technical issue.) >> > > Why is this a stride_trick? > > I thought this looks similar to expand_dims and could maybe be implemented > with some extra options there. > > > > Josef > > > >> -n >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ben.v.root at gmail.com Thu Nov 26 10:32:32 2015 From: ben.v.root at gmail.com (Benjamin Root) Date: Thu, 26 Nov 2015 10:32:32 -0500 Subject: [Numpy-discussion] New functions added in pull request In-Reply-To: References: Message-ID: Oooh, this will be nice to have. This would be one of the few times I would love to see unicode versions of these function names supplied, too. On Wed, Nov 25, 2015 at 5:31 PM, Antonio Lara wrote: > Hello, I have added three new functions to the file function_base.py in > the numpy/lib folder. These are divergence, curl and laplacian (for the > moment, laplacian of a scalar field, maybe in the future I will try > laplacian for a vector field). The calculation method is based in the > existing one for numpy.gradient, with central differences. > The changes are in this pull request: > > https://github.com/numpy/numpy/pull/6727 > > Thank you, > > Antonio > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Permafacture at gmail.com Thu Nov 26 11:32:05 2015 From: Permafacture at gmail.com (Elliot Hallmark) Date: Thu, 26 Nov 2015 10:32:05 -0600 Subject: [Numpy-discussion] Recognizing a cycle in a vector In-Reply-To: <20151126151801.GA12553@beagle> References: <20151126151801.GA12553@beagle> Message-ID: Fast fourier transform (fft)? 
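For instance, something along these lines gives a quick period estimate from the dominant frequency (just an illustrative sketch: the helper name estimate_period, the dt and min_peak_ratio parameters, and the threshold value are all made-up choices here, and it assumes a uniformly sampled 1-D signal):

```
import numpy as np

def estimate_period(x, dt=1.0, min_peak_ratio=10.0):
    # Rough cycle detection via the discrete Fourier transform.
    # Assumes x is a 1-D real-valued signal sampled at uniform steps dt.
    # min_peak_ratio (peak power over mean nonzero-frequency power) is an
    # arbitrary threshold for deciding the signal looks cyclical at all.
    x = np.asarray(x, dtype=float)
    x = x - x.mean()                      # drop the constant offset
    power = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(x.size, d=dt)
    k = 1 + np.argmax(power[1:])          # dominant nonzero-frequency bin
    if power[k] < min_peak_ratio * power[1:].mean():
        return None                       # no clear cycle
    return 1.0 / freqs[k]                 # period, up to bin resolution

# Quick check on a noisy cycle of period ~7.3, sampled every 0.5 time units:
t = np.arange(0.0, 400.0, 0.5)
x = np.sin(2 * np.pi * t / 7.3) + 0.05 * np.random.randn(t.size)
print(estimate_period(x, dt=0.5))         # prints something close to 7.3
```

Since the sample spacing will generally not divide the period exactly, the peak spreads over neighbouring bins (leakage), so the estimate is only good to about one bin width unless you window the data or interpolate around the peak.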
On Nov 26, 2015 9:21 AM, "Manolo Mart?nez" wrote: > Dear all, > > Suppose that I have a vector with the numerical solution of a > differential equation -- more concretely, I am working with evolutionary > game theory models, and the solutions are frequencies of types in a > population that follows the replicator dynamics; but this is probably > irrelevant. > > Sometimes these solutions are cyclical, yet I sample at points which do > not correspond with the period of the cycle, so that np.allclose() > cannot be directly applied. > > Is there any way to check for cycles in this situation? > > Thanks for any advice, > Manolo > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sank.daniel at gmail.com Thu Nov 26 11:38:40 2015 From: sank.daniel at gmail.com (Daniel Sank) Date: Thu, 26 Nov 2015 08:38:40 -0800 Subject: [Numpy-discussion] Recognizing a cycle in a vector In-Reply-To: References: <20151126151801.GA12553@beagle> Message-ID: Manolo, >> Is there any way to check for cycles in this situation? > Fast fourier transform (fft)? +1 For using a discrete Fourier transform, as implemented by numpy.fft.fft. You mentioned that you sample at points which do not correspond with the period of the signal; this introduces a slight complexity in how the Fourier transform reflects information about the original signal. I attach two documents to this email with details about those (and other) complexities. There is also much information on this topic online and in signal processing books. -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: spectral_leakage.pdf Type: application/pdf Size: 185839 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: dtft_aliasing.pdf Type: application/pdf Size: 268775 bytes Desc: not available URL: From sank.daniel at gmail.com Thu Nov 26 11:40:44 2015 From: sank.daniel at gmail.com (Daniel Sank) Date: Thu, 26 Nov 2015 08:40:44 -0800 Subject: [Numpy-discussion] Recognizing a cycle in a vector In-Reply-To: <20151126151801.GA12553@beagle> References: <20151126151801.GA12553@beagle> Message-ID: Oops, that leakage document is incomplete. Guess I should finish it up. On Thu, Nov 26, 2015 at 7:18 AM, Manolo Mart?nez wrote: > Dear all, > > Suppose that I have a vector with the numerical solution of a > differential equation -- more concretely, I am working with evolutionary > game theory models, and the solutions are frequencies of types in a > population that follows the replicator dynamics; but this is probably > irrelevant. > > Sometimes these solutions are cyclical, yet I sample at points which do > not correspond with the period of the cycle, so that np.allclose() > cannot be directly applied. > > Is there any way to check for cycles in this situation? > > Thanks for any advice, > Manolo > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Daniel Sank -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From manolo at austrohungaro.com Thu Nov 26 16:59:58 2015 From: manolo at austrohungaro.com (Manolo =?iso-8859-1?Q?Mart=EDnez?=) Date: Thu, 26 Nov 2015 22:59:58 +0100 Subject: [Numpy-discussion] Recognizing a cycle in a vector In-Reply-To: References: <20151126151801.GA12553@beagle> Message-ID: <20151126215958.GA28958@beagle>
> >> Is there any way to check for cycles in this situation?
> > Fast fourier transform (fft)?
> +1 For using a discrete Fourier transform, as implemented by numpy.fft.fft.
> You mentioned that you sample at points which do not correspond with the
> period of the signal; this introduces a slight complexity in how the
> Fourier transform reflects information about the original signal. I attach
> two documents to this email with details about those (and other)
> complexities. There is also much information on this topic online and in
> signal processing books.
Dear Elliot, Daniel, Thanks a lot for that. Off to read! M From antlarac at gmail.com Fri Nov 27 05:11:10 2015 From: antlarac at gmail.com (Antonio Lara) Date: Fri, 27 Nov 2015 11:11:10 +0100 Subject: [Numpy-discussion] ENH: added vector operators: divergence, curl and laplacian #6727 Message-ID: Hello, I have corrected the errors in my previous pull request that includes the new functions divergence, curl and laplacian. https://github.com/numpy/numpy/pull/6727 Thank you, Antonio -------------- next part -------------- An HTML attachment was scrubbed... URL: From Stephan.Sahm at gmx.de Fri Nov 27 05:37:17 2015 From: Stephan.Sahm at gmx.de (Stephan Sahm) Date: Fri, 27 Nov 2015 11:37:17 +0100 Subject: [Numpy-discussion] FeatureRequest: support for array construction from iterators Message-ID: [ this request/discussion refers to numpy issue #5863: https://github.com/numpy/numpy/pull/5863#issuecomment-159738368 ] Dear all, As far as I can think, the expected functionality of np.array(...) would be np.array(list(...)) or something even nicer. Therefore, I would like to request generator/iterator support for np.array(...) as far as list(...) supports it. A more detailed reasoning follows. In general it seems possible to identify iterators/generators as needed for this purpose: - someone actually implemented this feature already (see #5863) - there are ``types.GeneratorType`` and ``collections.abc.Iterator`` for ``isinstance(...)`` checks - numpy can already distinguish them from all other types, which get translated well into a numpy array Given this, I think the general argument goes roughly like the following: PROS (affects maybe 10% of numpy users or more): - more intuitive overall behaviour, array(...) = array(list(...)) roughly - python3 compatibility (see e.g. #5951) - compatibility with the analogous ``__builtin__`` functions (see e.g. #5756) - all of the above make numpy easier to use in an interactive style (e.g. ipython --pylab), where coding time matters more than computation time CONS (affects less than 0.1% of numpy users, I would guess): - might break existing code which in total, at least for me at this stage, speaks in favour of merging the already existing feature branch (see #5863) or something similar into numpy master. Discussion, please! cheers, Stephan -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From alan.isaac at gmail.com Fri Nov 27 08:18:59 2015 From: alan.isaac at gmail.com (Alan G Isaac) Date: Fri, 27 Nov 2015 08:18:59 -0500 Subject: [Numpy-discussion] FeatureRequest: support for array construction from iterators In-Reply-To: References: Message-ID: <56585843.80103@gmail.com> On 11/27/2015 5:37 AM, Stephan Sahm wrote: > I like to request a generator/iterator support for np.array(...) as far as list(...) supports it. http://docs.scipy.org/doc/numpy/reference/generated/numpy.fromiter.html hth, Alan Isaac From eliben at gmail.com Fri Nov 27 11:13:52 2015 From: eliben at gmail.com (Eli Bendersky) Date: Fri, 27 Nov 2015 08:13:52 -0800 Subject: [Numpy-discussion] understanding buffering done when broadcasting In-Reply-To: <1448439685.16828.14.camel@sipsolutions.net> References: <1448316593.1604.33.camel@sipsolutions.net> <1448439685.16828.14.camel@sipsolutions.net> Message-ID: On Wed, Nov 25, 2015 at 12:21 AM, Sebastian Berg wrote: > On Di, 2015-11-24 at 16:49 -0800, Eli Bendersky wrote: > > > > > > On Mon, Nov 23, 2015 at 2:09 PM, Sebastian Berg > > wrote: > > On Mo, 2015-11-23 at 13:31 -0800, Eli Bendersky wrote: > > > Hello, > > > > > > > > > I'm trying to understand the buffering done by the Numpy > > iterator > > > interface (the new post 1.6-one) when running ufuncs on > > arrays that > > > require broadcasting. Consider this simple case: > > > > > > In [35]: m = np.arange(16).reshape(4,4) > > > In [37]: n = np.arange(4) > > > In [39]: m + n > > > Out[39]: > > > array([[ 0, 2, 4, 6], > > > [ 4, 6, 8, 10], > > > [ 8, 10, 12, 14], > > > [12, 14, 16, 18]]) > > > > > > > > > > > > On first sight this seems true. However, there is one other > > point to > > consider here. The inner ufunc loop can only handle a single > > stride. The > > contiguous array `n` has to be iterated as if it had the > > strides > > `(0, 8)`, which is not the strides of the contiguous array `m` > > which can > > be "unrolled" to 1-D. Those effective strides are thus not > > contiguous > > for the inner ufunc loop and cannot be unrolled into a single > > ufunc > > call! > > > > The optimization (which might kick in a bit more broadly > > maybe), is thus > > that the number of inner loop calls is minimized, whether that > > is worth > > it, I am not sure, it may well be that there is some worthy > > optimization > > possible here. > > Note however, that this does not occur for large inner loop > > sizes > > (though I think you can find some "bad" sizes): > > > > ``` > > In [18]: n = np.arange(40000) > > > > In [19]: m = np.arange(160000).reshape(4,40000) > > > > In [20]: o = m + n > > Iterator: Checking casting for operand 0 > > op: dtype('int64'), iter: dtype('int64') > > Iterator: Checking casting for operand 1 > > op: dtype('int64'), iter: dtype('int64') > > Iterator: Checking casting for operand 2 > > op: , iter: dtype('int64') > > Iterator: Setting allocated stride 1 for iterator dimension 0 > > to 8 > > Iterator: Setting allocated stride 0 for iterator dimension 1 > > to 320000 > > Iterator: Copying inputs to buffers > > Iterator: Expanding inner loop size from 8192 to 40000 since > > buffering > > wasn't needed > > Any buffering needed: 0 > > Iterator: Finished copying inputs to buffers (buffered size is > > 40000) > > ``` > > > > > > The heuristic in the code says that if we can use a single stride and > > that's larger than the buffer size (which I assume is the default > > buffer size, and can change) then it's "is_onestride" and no buffering > > is done. 
> > > > > > So this led me to explore around this threshold (8192 items by default > > on my machine), and indeed we can notice funny behavior: > > > > In [51]: %%timeit n = arange(8192); m = > > np.arange(8192*40).reshape(40,8192) > > o = m + n > > ....: > > 1000 loops, best of 3: 274 ?s per loop > > > > In [52]: %%timeit n = arange(8292); m = > > np.arange(8292*40).reshape(40,8292) > > o = m + n > > ....: > > 1000 loops, best of 3: 229 ?s per loop > > > > > > So, given this, it's not very clear why the "optimization" kicks in. > > Buffering for small sizes seems like a mistake. > > > > I am pretty sure it is not generally a mistake. Consider the case of an > 10000x3 array (note that shrinking the buffer can have great advantage > though, I am not sure if this is done usually). > If you have (10000, 3) + (3,) arrays, then the ufunc outer loop would > have 10000x overhead. Doing the buffering (which I believe has some > extra code to be faster), will lower this to a few ufunc inner loop > calls. > I have not timed it, but I would be a bit surprised if it was not faster > in this case at least. Even calling a C function (and looping) for an > inner loop of 3 elements, should be quite a bit of overhead, and my > guess is more overhead than the buffering. > Yes, that's a good point for arrays shaped like this. I guess all this leaves us with is a realization that the heuristic *could* be tuned somewhat for arrays where the inner dimension is large - as in the case I demonstrated above it's nonsensical to have a computation be 20% faster when the array size increases over an arbitrary threshold. I'll see if I can find some time to dig more into this and figure out where the knobs to tweak the heuristic are. Thanks for the enlightening discussion, Sebastian Eli > - Sebastian > > > > > > Eli > > > > > > > > > > Anyway, feel free to have a look ;). The code is not the most > > read one > > in NumPy, and it would not surprise me a lot if you can find > > something > > to tweak. > > > > - Sebastian > > > > > > > > > > If I instrument Numpy (setting NPY_IT_DBG_TRACING and such), > > I see > > > that when the add() ufunc is called, 'n' is copied into a > > temporary > > > buffer by the iterator. The ufunc then gets the buffer as > > its data. > > > > > > > > > My question is: why is this buffering needed? It seems > > wasteful, since > > > no casting is required here, no special alignment problems > > and also > > > 'n' is contiguously laid out in memory. It seems that it > > would be more > > > efficient to just use 'n' in the ufunc instead of passing in > > the > > > buffer. What am I missing? > > > > > > > > > Thanks in advance, > > > > > > Eli > > > > > > _______________________________________________ > > > NumPy-Discussion mailing list > > > NumPy-Discussion at scipy.org > > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From sebastian at sipsolutions.net Sun Nov 29 13:56:05 2015 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Sun, 29 Nov 2015 19:56:05 +0100 Subject: [Numpy-discussion] Indexing NEP draft In-Reply-To: <1447236170.2487.43.camel@sipsolutions.net> References: <1447236170.2487.43.camel@sipsolutions.net> Message-ID: <1448823365.17293.6.camel@sipsolutions.net> Hey, small update on this. The NEP draft has not changed much, but you can now try the full power of the proposed new indexing types [1]: * arr.oindex[...] # orthogonal/outer indexing * arr.vindex[...] # vectorized (like fancy, but different ;)) * arr.lindex[...] # legacy/fancy indexing with my pull request at https://github.com/numpy/numpy/pull/6075 You can try it locally by cloning the numpy github repository and then running from the source dir: git fetch upstream pull/6075/head:pr-6075 && git checkout pr-6075; python runtests.py --ipython # Inside ipython: import warnings; warnings.simplefilter("always") The examples from the NEP at should all run fine, you can find the NEP draft at: https://github.com/numpy/numpy/pull/6256/files?short_path=01e4dd9#diff-01e4dd9d2ecf994b24e5883f98f789e6 I would be most happy about any comments or suggestions! - Sebastian [1] Modulo possible bugs, there is not test suit yet.... On Mi, 2015-11-11 at 11:02 +0100, Sebastian Berg wrote: > Hi all, > > at scipy discussing with Nathaniel and others, we thought that maybe we > can push for orthogonal type indexing into numpy. Now with the new > version out and some other discussions done, I thought it is time to > pick it up :). > > The basic ideas are twofold. First make indexing easier and less > confusing for starters (and advanced users also), and second improve > interoperability with projects such as xray for whom orthogonal/outer > type indexing makes more sense. > > I have started working on: > > 1. A preliminary draft of an NEP you can view at > https://github.com/numpy/numpy/pull/6256/files?short_path=01e4dd9#diff-01e4dd9d2ecf994b24e5883f98f789e6 > or at the end of this mail. > > 2. A preliminary implementation of `oindex` attribute with > orthogonal/outer style indexing in > https://github.com/numpy/numpy/pull/6075 which you can try out by > cloning numpy and then running from the source dir: > > git fetch upstream pull/6075/head:pr-6075 && git checkout pr-6075; > python runtests.py --ipython > > This will fetch my PR, switch to the branch and open an interactive > ipython shell where you will be able to do arr.oindex[]. > > > Note that I consider the NEP quite preliminary in many parts, and it may > still be very confusing unless you are well versed with current advanced > indexing. There are some longer examples comparing the different styles > and another "example" which tries to show a "use case" example going > from simpler to more complex indexing operations. > Any comments are very welcome, and if it is "I don't understand a > word" :). I know it is probably too short and, at least without > examples, not easy to understand. 
> > Best, > > Sebastian > > > ================================================================================== > The current NEP draft: > > > ========================================================== > Implementing intuitive and full featured advanced indexing > ========================================================== > > :Author: Sebastian Berg > :Date: 2015-08-27 > :Status: draft > > > Executive summary > ================= > > Advanced indexing with multiple array indices is typically confusing to > both new, and in many cases even old, users of NumPy. To avoid this > problem > and allow for more and clearer features, we propose to: > > 1. Introduce ``arr.oindex[indices]`` which allows advanced indices, but > uses outer indexing logic. > 2. Introduce ``arr.vindex[indices]`` which use the current > "vectorized"/broadcasted logic but with two differences from > fancy indexing: > > 1. Boolean indices always use the outer indexing logic. > (Multi dimensional booleans should be allowed). > 2. The integer index result dimensions are always the first axes > of the result array. No transpose is done, even for a single > integer array index. > > 3. Vanilla indexing on the array will only give warnings and eventually > errors either: > > * when there is ambiguity between legacy fancy and outer indexing > (note that ``arr[[1, 2], :, 0]`` is such a case, an integer > can be the "second" integer index array), > * when any integer index array is present (possibly additional for > more then one boolean index array). > > These constraints are sufficient for making indexing generally > consistent > with expectations and providing a less surprising learning curve with > ``oindex``. > > Note that all things mentioned here apply both for assignment as well as > subscription. > > Understanding these details is *not* easy. The `Examples` section gives > code > examples. And the hopefully easier `Motivational Example` provides some > motivational use-cases for the general ideas and is likely a good start > for > anyone not intimately familiar with advanced indexing. > > > Motivation > ========== > > Old style advanced indexing with multiple array (boolean or integer) > indices, > also called "fancy indexing", tends to be very confusing for new users. > While fancy (or legacy) indexing is useful in many cases one would > naively > assume that the result of multiple 1-d ranges is analogous to multiple > slices along each dimension (also called "outer indexing"). > > However, legacy fancy indexing with multiple arrays broadcasts these > arrays > into a single index over multiple dimensions. There are three main > points > of confusion when multiple array indices are involved: > > 1. Most new users will usually expect outer indexing (consistent with > slicing). This is also the most common way of handling this in other > packages or languages. > 2. The axes introduced by the array indices are at the front, unless > all array indices are consecutive, in which case one can deduce where > the user "expects" them to be: > > * `arr[:, [0, 1], :, [0, 1]]` will have the first dimension shaped 2. > * `arr[:, [0, 1], [0, 1]]` will have the second dimension shaped 2. > > 3. When a boolean array index is mixed with another boolean or integer > array, the result is very hard to understand (the boolean array is > converted to integer array indices and then broadcast), and hardly > useful. > There is no well defined broadcast for booleans, so that boolean > indices are logically always "``outer``" type indices. 
> > > Proposed rules > ============== > > From the three problems noted above some expectations for NumPy can > be deduced: > > 1. There should be a prominent outer/orthogonal indexing method such as > ``arr.oindex[indices]``. > 2. Considering how confusing fancy indexing can be, it should only > occur explicitly (e.g. ``arr.vindex[indices]``) > 3. A new ``arr.vindex[indices]`` method, would not be tied to the > confusing transpose rules of fancy indexing (which is for example > needed for the simple case of a single advanced index). Thus, it > no transposing should be done. The axes of the advanced indices are > always inserted at the front, even for a single index. > 4. Boolean indexing is conceptionally outer indexing. A broadcasting > together with other advanced indices in the manner of legacy > "fancy indexing" is generally not helpful or well defined. > A user who wishes the "``nonzero``" plus broadcast behaviour can thus > be expected to do this manually. > Using this rule, a single boolean index can index into multiple > dimensions at once. > 5. An ``arr.lindex`` or ``arr.findex`` should likely be implemented to > allow > legacy fancy indexing indefinetly. This also gives a simple way to > update fancy indexing code making deprecations to vanilla indexing > easier. > 6. Vanilla indexing ``arr[...]`` could return an error for ambiguous > cases. > For the beginning, this probably means cases where ``arr[ind]`` and > ``arr.oindex[ind]`` return different results gives deprecation > warnings. > However, the exact rules for this (especially the final behaviour) > are not > quite clear in cases such as ``arr[0, :, index_arr]``. > > All other rules for indexing are identical. > > > Open Questions > ============== > > 1. Especially for the new indexing attributes ``oindex`` and ``vindex``, > a case could be made to not implicitly add an ``Ellipsis`` index if > necessary. > This helps finding bugs since a too high dimensional array can be > caught. > (I am in favor for this, but doubt we should think about this for > vanilla > indexing.) > > 2. The names ``oindex`` and ``vindex`` are just suggestions at the time > of > writing this, another name NumPy has used for something like > ``oindex`` > is ``np.ix_``. See also below. > > 3. It would be possible to limit the use of boolean indices in > ``vindex``, > assuming that they are rare and to some degree special. > (This would make implementation simpler, but I do not see a big > reason.) > > 4. ``oindex`` and ``vindex`` could always return copies, even when no > array > operation occurs. One argument for using the same rules is that this > way > ``oindex`` can be used as a general index replacement. > (There is likely no big reason for this, however, there is one > reason: > ``arr.vindex[array_scalar, ...]`` can occur, where ``arr_scalar`` > should be a 0-D array. Copying always "fixes" the possible > inconsistency.) > > 5. The final state to morph indexing in is not fixed in this PEP. It is > for > example possible that `arr[index]`` will be equivalent to > ``arr.oindex`` > at some point in the future. Since such a change will take years, it > seems unnecessary to make specific decisions now. > > 6. Proposed changes to vanilla indexing could be postponed indefinetly > or > not taken in order to not break or force fixing of existing code > bases. > > 7. Possible the ``vindex`` combination with boolean indexing could be > rethought or not allowed at all for simplicity. 
> > > Necessary changes to NumPy > ========================== > > Implement ``arr.oindex`` and ``arr.vindex`` objects to allow these > indexing > operations and create warnings (and eventually deprecate) ambiguous > direct > indexing operations on arrays. > > > Alternative Names > ================= > > Possible names suggested (more suggestions will be added). > > ============== ======== ======= > **Orthogonal** oindex oix > **Vectorized** vindex fix > **Legacy** l/findex > ============== ======== ======= > > > Examples > ======== > > Since the various kinds of indexing is hard to grasp in many cases, > these > examples hopefully give some more insights. Note that they are all in > terms > of shape. All original dimensions start with 5, advanced indexing > inserts less long dimensions. (Note that ``...`` or ``Ellipsis`` mostly > inserts as many slices as needed to index the full array). These > examples > may be hard to grasp without working knowledge of advanced indexing as > of > NumPy 1.9. > > Example array:: > > >>> arr = np.ones((5, 6, 7, 8)) > > > Legacy fancy indexing > --------------------- > > Single index is transposed (this is the same for all indexing types):: > > >>> arr[[0], ...].shape > (1, 6, 7, 8) > >>> arr[:, [0], ...].shape > (5, 1, 7, 8) > > > Multiple indices are transposed *if* consecutive:: > > >>> arr[:, [0], [0], :].shape # future error > (5, 1, 7) > >>> arr[:, [0], :, [0]].shape # future error > (1, 5, 6) > > > It is important to note that a scalar *is* integer array index in this > sense > (and gets broadcasted with the other advanced index):: > > >>> arr[:, [0], 0, :].shape # future error (scalar is "fancy") > (5, 1, 7) > >>> arr[:, [0], :, 0].shape # future error (scalar is "fancy") > (1, 5, 6) > > > Single boolean index can act on multiple dimensions (especially the > whole > array). It has to match (as of 1.10. a deprecation warning) the > dimensions. 
> The boolean index is otherwise identical to (multiple consecutive) > integer > array indices:: > > >>> # Create boolean index with one True value for the last two > dimensions: > >>> bindx = np.zeros((7, 8), dtype=np.bool_) > >>> bindx[[0, 0]] = True > >>> arr[:, 0, bindx].shape > (5, 1) > >>> arr[0, :, bindx].shape > (1, 6) > > > The combination with anything that is not a scalar is confusing, e.g.:: > > >>> arr[[0], :, bindx].shape # bindx result broadcasts with [0] > (1, 6) > >>> arr[:, [0, 1], bindx] # IndexError > > > Outer indexing > -------------- > > Multiple indices are "orthogonal" and their result axes are inserted > at the same place (they are not broadcasted):: > > >>> arr.oindex[:, [0], [0, 1], :].shape > (5, 1, 2, 8) > >>> arr.oindex[:, [0], :, [0, 1]].shape > (5, 1, 7, 2) > >>> arr.oindex[:, [0], 0, :].shape > (5, 1, 8) > >>> arr.oindex[:, [0], :, 0].shape > (5, 1, 7) > > > Boolean indices results are always inserted where the index is:: > > >>> # Create boolean index with one True value for the last two > dimensions: > >>> bindx = np.zeros((7, 8), dtype=np.bool_) > >>> bindx[[0, 0]] = True > >>> arr.oindex[:, 0, bindx].shape > (5, 1) > >>> arr.oindex[0, :, bindx].shape > (6, 1) > > > Nothing changed in the presence of other advanced indices since:: > > >>> arr.oindex[[0], :, bindx].shape > (1, 6, 1) > >>> arr.oindex[:, [0, 1], bindx] > (5, 2, 1) > > > Vectorized/inner indexing > ------------------------- > > Multiple indices are broadcasted and iterated as one like fancy > indexing, > but the new axes area always inserted at the front:: > > >>> arr.vindex[:, [0], [0, 1], :].shape > (2, 5, 8) > >>> arr.vindex[:, [0], :, [0, 1]].shape > (2, 5, 7) > >>> arr.vindex[:, [0], 0, :].shape > (1, 5, 8) > >>> arr.vindex[:, [0], :, 0].shape > (1, 5, 7) > > > Boolean indices results are always inserted where the index is, exactly > as in ``oindex`` given how specific they are to the axes they operate > on:: > > >>> # Create boolean index with one True value for the last two > dimensions: > >>> bindx = np.zeros((7, 8), dtype=np.bool_) > >>> bindx[[0, 0]] = True > >>> arr.vindex[:, 0, bindx].shape > (5, 1) > >>> arr.vindex[0, :, bindx].shape > (6, 1) > > > But other advanced indices are again transposed to the front:: > > >>> arr.vindex[[0], :, bindx].shape > (1, 6, 1) > >>> arr.vindex[:, [0, 1], bindx] > (2, 5, 1) > > > Related Questions > ================= > > There exist a further indexing or indexing like method. That is the > inverse of a command such as ``np.argmin(arr, axis=axis)``, to pick > the specific elements *along* an axis given an array of (at least > typically) the same size. > > Doing such a thing with the indexing notation is not quite straight > forward > since the axis on which to pick elements has to be supplied. One > plausible > solution would be to create a function (calling it pick here for > simplicity):: > > np.pick(arr, index_arr, axis=axis) > > where ``index_arr`` has to be the same shape as ``arr`` except along > ``axis``. > One could imagine that this can be useful together with other indexing > types, > but such a function may be sufficient and extra information needed seems > easier > to pass using a function convention. Another option would be to allow an > argument > such as ``compress_axes=None`` (just to have some name) which maps the > axes from > the index array to the new array with ``None`` signaling a new axis. > Also keepdims could be added as a simple default. 
(Note that the use of > axis is not > compatible to ``np.take`` for an ``index_arr`` which is not zero or one > dimensional.) > > Another solution is to provide functions or features to the > ``arg*``functions > to map this to the equivalent ``vindex`` indexing operation. > > > Motivational Example > ==================== > > Imagine having a data acquisition software storing ``D`` channels and > ``N`` datapoints along the time. She stores this into an ``(N, D)`` > shaped > array. During data analysis, we needs to fetch a pool of channels, for > example > to calculate a mean over them. > > This data can be faked using:: > > >>> arr = np.random.random((100, 10)) > > Now one may remember indexing with an integer array and find the correct > code:: > > >>> group = arr[:, [2, 5]] > >>> mean_value = arr.mean() > > However, assume that there were some specific time points (first > dimension > of the data) that need to be specially considered. These time points are > already known and given by:: > > >>> interesting_times = np.array([1, 5, 8, 10], dtype=np.intp) > > Now to fetch them, we may try to modify the previous code:: > > >>> group_at_it = arr[interesting_times, [2, 5]] > IndexError: Ambiguous index, use `.oindex` or `.vindex` > > An error such as this will point to read up the indexing documentation. > This should make it clear, that ``oindex`` behaves more like slicing. > So, out of the different methods it is the obvious choice > (for now, this is a shape mismatch, but that could possibly also mention > ``oindex``):: > > >>> group_at_it = arr.oindex[interesting_times, [2, 5]] > > Now of course one could also have used ``vindex``, but it is much less > obvious how to achieve the right thing!:: > > >>> reshaped_times = interesting_times[:, np.newaxis] > >>> group_at_it = arr.vindex[reshaped_times, [2, 5]] > > > One may find, that for example our data is corrupt in some places. > So, we need to replace these values by zero (or anything else) for these > times. The first column may for example give the necessary information, > so that changing the values becomes easy remembering boolean indexing:: > > >>> bad_data = arr[0] > 0.5 > >>> arr[bad_data, :] = 0 > > Again, however, the columns may need to be handled more individually > (but in > groups), and the ``oindex`` attribute works well:: > > >>> arr.oindex[bad_data, [2, 5]] = 0 > > Note that it would be very hard to do this using legacy fancy indexing. > The only way would be to create an integer array first:: > > >>> bad_data_indx = np.nonzero(bad_data)[0] > >>> bad_data_indx_reshaped = bad_data_indx[:, np.newaxis] > >>> arr[bad_data_indx_reshaped, [2, 5]] > > In any case we can use only ``oindex`` to do all of this without getting > into any trouble or confused by the whole complexity of advanced > indexing. > > But, some new features are added to the data acquisition. Different > sensors > have to be used depending on the times. Let us assume we already have > created an array of indices:: > > >>> correct_sensors = np.random.randint(10, size=(100, 2)) > > Which lists for each time the two correct sensors in an ``(N, 2)`` > array. > > A first try to achieve this may be ``arr[:, correct_sensors]`` and this > does > not work. It should be clear quickly that slicing cannot achieve the > desired > thing. But hopefully users will remember that there is ``vindex`` as a > more > powerful and flexible approach to advanced indexing. 
> One may, if trying ``vindex`` randomly, be confused about:: > > >>> new_arr = arr.vindex[:, correct_sensors] > > which is neither the same, nor the correct result (see transposing > rules)! > This is because slicing works still the same in ``vindex``. However, > reading > the documentation and examples, one can hopefully quickly find the > desired > solution:: > > >>> rows = np.arange(len(arr)) > >>> rows = rows[:, np.newaxis] # make shape fit with > correct_sensors > >>> new_arr = arr.vindex[rows, correct_sensors] > > At this point we have left the straight forward world of ``oindex`` but > can > do random picking of any element from the array. Note that in the last > example > a method such as mentioned in the ``Related Questions`` section could be > more > straight forward. But this approach is even more flexible, since > ``rows`` > does not have to be a simple ``arange``, but could be > ``intersting_times``:: > > >>> correct_sensors_at_it = correct_sensors[interesting_times, :] > >>> interesting_times_reshaped = interesting_times[:, np.newaxis] > >>> new_arr_it = arr[interesting_times_reshaped, > correct_sensors_at_it] > > Truly complex situation would arise now if you would for example pool > ``L`` > experiments into an array shaped ``(L, N, D)``. But for ``oindex`` this > should > not result into surprises. ``vindex``, being more powerful, will quite > certainly create some confusion in this case but also cover pretty much > all > eventualities. > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From charlesr.harris at gmail.com Sun Nov 29 15:28:47 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 29 Nov 2015 13:28:47 -0700 Subject: [Numpy-discussion] Python development on fedora 23. Message-ID: Hi Fedora users, Python distutils on fedora 23 is configured for hardening, hence `redhat-rpm-config` is a dependency if you want to build numpy, scipy, etc. The symptom is a "broken toolchain" error. A bug is open for this and it might get fixed on the Python end, but I don't expect anything soon. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From vilanova at ac.upc.edu Mon Nov 30 12:42:06 2015 From: vilanova at ac.upc.edu (=?utf-8?Q?Llu=C3=ADs_Vilanova?=) Date: Mon, 30 Nov 2015 18:42:06 +0100 Subject: [Numpy-discussion] Inconsistent/unexpected indexing semantics Message-ID: <87zixvqt8x.fsf@fimbulvetr.bsc.es> Hi, TL;DR: There's a pending pull request deprecating some behaviour I find unexpected. Does anyone object? Some time ago I noticed that numpy yields unexpected results in some very specific cases. 
An array can be used to index multiple elements of a single dimension: >>> a = np.arange(8).reshape((2,2,2)) >>> a[ np.array([[0], [0]]) ] array([[[[0, 1], [2, 3]]], [[[0, 1], [2, 3]]]]) Nonetheless, if a list is used instead, it is (unexpectedly) transformed into a tuple, resulting in indexing across multiple dimensions: >>> a[ [[0], [0]] ] array([[0, 1]]) I.e., it is interpeted as: >>> a[ [0], [0] ] array([[0, 1]]) Or what is the same: >>> a[( [0], [0] )] array([[0, 1]]) I've been informed that there's a pending pull request that deprecates this behaviour [1], which could in the future be reverted to what is expected (at least what I expect) from the documents (except for an obscure note in [2]). The discussion leading to this mail can be found here [3]. [1] https://github.com/numpy/numpy/pull/4434 [2] http://docs.scipy.org/doc/numpy/reference/arrays.indexing.html#advanced-indexing [3] https://github.com/numpy/numpy/issues/6564 Thanks, Lluis -- "And it's much the same thing with knowledge, for whenever you learn something new, the whole world becomes that much richer." -- The Princess of Pure Reason, as told by Norton Juster in The Phantom Tollbooth From sebastian at sipsolutions.net Mon Nov 30 15:19:45 2015 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Mon, 30 Nov 2015 21:19:45 +0100 Subject: [Numpy-discussion] Inconsistent/unexpected indexing semantics In-Reply-To: <87zixvqt8x.fsf@fimbulvetr.bsc.es> References: <87zixvqt8x.fsf@fimbulvetr.bsc.es> Message-ID: <1448914785.7789.11.camel@sipsolutions.net> On Mo, 2015-11-30 at 18:42 +0100, Llu?s Vilanova wrote: > Hi, > > TL;DR: There's a pending pull request deprecating some behaviour I find > unexpected. Does anyone object? > > Some time ago I noticed that numpy yields unexpected results in some very > specific cases. An array can be used to index multiple elements of a single > dimension: > > >>> a = np.arange(8).reshape((2,2,2)) > >>> a[ np.array([[0], [0]]) ] > array([[[[0, 1], > [2, 3]]], > [[[0, 1], > [2, 3]]]]) > > Nonetheless, if a list is used instead, it is (unexpectedly) transformed into a > tuple, resulting in indexing across multiple dimensions: > > >>> a[ [[0], [0]] ] > array([[0, 1]]) > > I.e., it is interpeted as: > > >>> a[ [0], [0] ] > array([[0, 1]]) > > Or what is the same: > > >>> a[( [0], [0] )] > array([[0, 1]]) > > > I've been informed that there's a pending pull request that deprecates this > behaviour [1], which could in the future be reverted to what is expected (at > least what I expect) from the documents (except for an obscure note in [2]). > Obviously, I am not against this ;). I have to admit it worries me a bit, because there is quite a bit of code doing things like: >>> slice_object = [slice(None)] * 5 >>> slice_object[2] = 3 >>> arr[slice_object] and all of this code (numpy also has a lot of it), will probably have to change the last line to be: >>> arr[tuple(slice_object)] So the implication of this might actually be more farther reaching then one might think at first; or at least require quite a lot of code to be touched (inside numpy that is no problem, but outside). - Sebastian > The discussion leading to this mail can be found here [3]. > > [1] https://github.com/numpy/numpy/pull/4434 > [2] http://docs.scipy.org/doc/numpy/reference/arrays.indexing.html#advanced-indexing > [3] https://github.com/numpy/numpy/issues/6564 > > > Thanks, > Lluis > -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From njs at pobox.com Mon Nov 30 20:10:34 2015 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 30 Nov 2015 17:10:34 -0800 Subject: [Numpy-discussion] Inconsistent/unexpected indexing semantics In-Reply-To: <1448914785.7789.11.camel@sipsolutions.net> References: <87zixvqt8x.fsf@fimbulvetr.bsc.es> <1448914785.7789.11.camel@sipsolutions.net> Message-ID: On Nov 30, 2015 12:19 PM, "Sebastian Berg" wrote: > > On Mo, 2015-11-30 at 18:42 +0100, Llu?s Vilanova wrote: > > Hi, > > > > TL;DR: There's a pending pull request deprecating some behaviour I find > > unexpected. Does anyone object? > > > > Some time ago I noticed that numpy yields unexpected results in some very > > specific cases. An array can be used to index multiple elements of a single > > dimension: > > > > >>> a = np.arange(8).reshape((2,2,2)) > > >>> a[ np.array([[0], [0]]) ] > > array([[[[0, 1], > > [2, 3]]], > > [[[0, 1], > > [2, 3]]]]) > > > > Nonetheless, if a list is used instead, it is (unexpectedly) transformed into a > > tuple, resulting in indexing across multiple dimensions: > > > > >>> a[ [[0], [0]] ] > > array([[0, 1]]) > > > > I.e., it is interpeted as: > > > > >>> a[ [0], [0] ] > > array([[0, 1]]) > > > > Or what is the same: > > > > >>> a[( [0], [0] )] > > array([[0, 1]]) > > > > > > I've been informed that there's a pending pull request that deprecates this > > behaviour [1], which could in the future be reverted to what is expected (at > > least what I expect) from the documents (except for an obscure note in [2]). > > > > > Obviously, I am not against this ;). I have to admit it worries me a > bit, because there is quite a bit of code doing things like: > > >>> slice_object = [slice(None)] * 5 > >>> slice_object[2] = 3 > >>> arr[slice_object] > > and all of this code (numpy also has a lot of it), will probably have to > change the last line to be: > > >>> arr[tuple(slice_object)] This seems like an improvement to me, so I'm +1 on deprecating. I agree that it might be a very long time before we can actually change the behavior though. I think it would make sense to split it into two separate, parallel deprecations: one for lists like in Llu?s's example that are coerceable to an integer array, and one for lists like in your example that contain slices and stuff. The first case needs a FutureWarning, is incredibly confusing, and is unlikely to be used on purpose; the second only needs a DeprecationWarning, is less confusing, and is probably in broader use, so might want a longer deprecation period. -n -------------- next part -------------- An HTML attachment was scrubbed... URL: