From jtaylor.debian at googlemail.com  Sun Mar 1 16:05:43 2015
From: jtaylor.debian at googlemail.com (Julian Taylor)
Date: Sun, 01 Mar 2015 22:05:43 +0100
Subject: [Numpy-discussion] ANN: NumPy 1.9.2 bugfix release
Message-ID: <54F37F27.6090601@googlemail.com>

Hi,

We are pleased to announce the release of NumPy 1.9.2, a bugfix-only
release for the 1.9.x series.
The tarballs and win32 binaries are available on sourceforge:
https://sourceforge.net/projects/numpy/files/NumPy/1.9.2/
PyPI also contains the wheels for Mac OS X.

The upgrade is recommended for all users of the 1.9.x series.

The following issues have been fixed:

* #5316: fix too large dtype alignment of strings and complex types
* #5424: fix ma.median when used on ndarrays
* #5481: Fix astype for structured array fields of different byte order
* #5354: fix segfault when clipping complex arrays
* #5524: allow np.argpartition on non ndarrays
* #5612: Fixes ndarray.fill to accept full range of uint64
* #5155: Fix loadtxt with comments=None and a string None data
* #4476: Masked array view fails if structured dtype has datetime component
* #5388: Make RandomState.set_state and RandomState.get_state threadsafe
* #5390: make seed, randint and shuffle threadsafe
* #5374: Fixed incorrect assert_array_almost_equal_nulp documentation
* #5393: Add support for ATLAS > 3.9.33.
* #5313: PyArray_AsCArray caused segfault for 3d arrays
* #5492: handle out of memory in rfftf
* #4181: fix a few bugs in the random.pareto docstring
* #5359: minor changes to linspace docstring
* #4723: fix a compile issue on AIX

Cheers,
The NumPy Developer team

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 819 bytes
Desc: OpenPGP digital signature
URL:

From charlesr.harris at gmail.com  Sun Mar 1 16:18:25 2015
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Sun, 1 Mar 2015 14:18:25 -0700
Subject: [Numpy-discussion] ANN: NumPy 1.9.2 bugfix release
In-Reply-To: <54F37F27.6090601@googlemail.com>
References: <54F37F27.6090601@googlemail.com>
Message-ID:

On Sun, Mar 1, 2015 at 2:05 PM, Julian Taylor wrote:

> Hi,
>
> We are pleased to announce the release of NumPy 1.9.2, a bugfix-only
> release for the 1.9.x series.
> The tarballs and win32 binaries are available on sourceforge:
> https://sourceforge.net/projects/numpy/files/NumPy/1.9.2/
> PyPI also contains the wheels for Mac OS X.
>
> The upgrade is recommended for all users of the 1.9.x series.
>
> The following issues have been fixed:
> * #5316: fix too large dtype alignment of strings and complex types
> * #5424: fix ma.median when used on ndarrays
> * #5481: Fix astype for structured array fields of different byte order
> * #5354: fix segfault when clipping complex arrays
> * #5524: allow np.argpartition on non ndarrays
> * #5612: Fixes ndarray.fill to accept full range of uint64
> * #5155: Fix loadtxt with comments=None and a string None data
> * #4476: Masked array view fails if structured dtype has datetime component
> * #5388: Make RandomState.set_state and RandomState.get_state threadsafe
> * #5390: make seed, randint and shuffle threadsafe
> * #5374: Fixed incorrect assert_array_almost_equal_nulp documentation
> * #5393: Add support for ATLAS > 3.9.33.
> * #5313: PyArray_AsCArray caused segfault for 3d arrays
> * #5492: handle out of memory in rfftf
> * #4181: fix a few bugs in the random.pareto docstring
> * #5359: minor changes to linspace docstring
> * #4723: fix a compile issue on AIX
>
> Cheers,
> The NumPy Developer team

Thanks Julian.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From njs at pobox.com  Mon Mar 2 17:28:06 2015
From: njs at pobox.com (Nathaniel Smith)
Date: Mon, 2 Mar 2015 14:28:06 -0800
Subject: [Numpy-discussion] Congratulations to Chris Barker...
Message-ID:

...on the acceptance of his PEP! PEP 485 adds a math.isclose function
to the standard library, encouraging people to do numerically more
reasonable floating point comparisons.

The PEP:
https://www.python.org/dev/peps/pep-0485/

The pronouncement:
http://thread.gmane.org/gmane.comp.python.devel/151776/focus=151778

-n

--
Nathaniel J. Smith -- http://vorpus.org

From shoyer at gmail.com  Mon Mar 2 17:57:17 2015
From: shoyer at gmail.com (Stephan Hoyer)
Date: Mon, 2 Mar 2015 14:57:17 -0800
Subject: [Numpy-discussion] [SciPy-User] Congratulations to Chris Barker...
In-Reply-To:
References:
Message-ID:

Indeed, congratulations Chris!

Are there plans to write a vectorized version for NumPy? :)

On Mon, Mar 2, 2015 at 2:28 PM, Nathaniel Smith wrote:

> ...on the acceptance of his PEP! PEP 485 adds a math.isclose function
> to the standard library, encouraging people to do numerically more
> reasonable floating point comparisons.
>
> The PEP:
> https://www.python.org/dev/peps/pep-0485/
>
> The pronouncement:
> http://thread.gmane.org/gmane.comp.python.devel/151776/focus=151778
>
> -n
>
> --
> Nathaniel J. Smith -- http://vorpus.org
> _______________________________________________
> SciPy-User mailing list
> SciPy-User at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From njs at pobox.com  Mon Mar 2 19:51:21 2015
From: njs at pobox.com (Nathaniel Smith)
Date: Mon, 2 Mar 2015 16:51:21 -0800
Subject: [Numpy-discussion] [SciPy-User] Congratulations to Chris Barker...
In-Reply-To:
References:
Message-ID:

On Mon, Mar 2, 2015 at 2:57 PM, Stephan Hoyer wrote:
> Indeed, congratulations Chris!
>
> Are there plans to write a vectorized version for NumPy? :)

np.isclose isn't identical, but IIRC the only difference is the defaults.

-n

--
Nathaniel J. Smith -- http://vorpus.org

From chris.barker at noaa.gov  Mon Mar 2 20:23:12 2015
From: chris.barker at noaa.gov (Chris Barker - NOAA Federal)
Date: Mon, 2 Mar 2015 17:23:12 -0800
Subject: [Numpy-discussion] [SciPy-User] Congratulations to Chris Barker...
In-Reply-To:
References:
Message-ID: <-5458918514508544704@unknownmsgid>

>> Are there plans to write a vectorized version for NumPy? :)
>
> np.isclose isn't identical, but IIRC the only difference is the defaults.

There are subtle differences in the algorithm as well. But not enough
that it makes sense to change the numpy one. The results will be similar
in most cases, and identical for a relative tolerance less than 1e-8
(for float64).

-Chris

> -n
>
> --
> Nathaniel J. Smith -- http://vorpus.org
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion

From charlesr.harris at gmail.com  Mon Mar 2 20:23:35 2015
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Mon, 2 Mar 2015 18:23:35 -0700
Subject: [Numpy-discussion] [SciPy-User] Congratulations to Chris Barker...
In-Reply-To:
References:
Message-ID:

On Mon, Mar 2, 2015 at 5:51 PM, Nathaniel Smith wrote:

> On Mon, Mar 2, 2015 at 2:57 PM, Stephan Hoyer wrote:
> > Indeed, congratulations Chris!
> >
> > Are there plans to write a vectorized version for NumPy? :)
>
> np.isclose isn't identical, but IIRC the only difference is the defaults.

There are two differences I saw. Numpy requires the error to be less than
the sum of the relative and absolute errors, while the PEP uses the
maximum of the two. The other is that numpy has a keyword for nans to
compare equal.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
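A minimal sketch of the two rules being compared above (the values and
names are illustrative only, and math.isclose assumes Python >= 3.5):

    import math
    import numpy as np

    a, b = 1.0, 1.0 + 1e-9

    # PEP 485 / math.isclose:
    #   |a - b| <= max(rel_tol * max(|a|, |b|), abs_tol)
    math.isclose(a, b, rel_tol=1e-9, abs_tol=0.0)

    # np.isclose:
    #   |a - b| <= atol + rtol * |b|   (a sum, and asymmetric in b)
    np.isclose(a, b, rtol=1e-9, atol=0.0)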
From muh_kosta at mail.ru  Tue Mar 3 00:39:20 2015
From: muh_kosta at mail.ru (=?UTF-8?B?0JrQvtC90YHRgtCw0L3RgtC40L0g0JzRg9GF0LDQvNC10LTQttCw0L3QvtCy?=)
Date: Tue, 03 Mar 2015 08:39:20 +0300
Subject: [Numpy-discussion] =?utf-8?q?question_numpy_build?=
Message-ID: <1425361160.361959607@f316.i.mail.ru>

Hi, sorry for my English (I am from Russia). I have a question: how can I
build a NumPy binary extension using a setup.py file? How can I specify
the compiler program to setup.py? I could not find the --compiler option
that was there when I built another package's extension. I have the
MinGW32 compiler.

Regards
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From sturla.molden at gmail.com  Tue Mar 3 05:54:35 2015
From: sturla.molden at gmail.com (Sturla Molden)
Date: Tue, 3 Mar 2015 10:54:35 +0000 (UTC)
Subject: [Numpy-discussion] Would like to patch docstring for numpy.random.normal
References:
Message-ID: <1557076892447072653.462812sturla.molden-gmail.com@news.gmane.org>

Daniel Sank wrote:

> It seems unnecessarily convoluted to name the input arguments "loc" and
> "scale", then immediately define them as the "mean" and "standard
> deviation" in the Parameters section, and then again rename them as "mu"
> and "sigma" in the written formula. I propose to simply change the argument
> names to "mean" and "sigma" to improve consistency.

Change the name of keyword arguments? This would be evil.

From shoyer at gmail.com  Tue Mar 3 14:01:48 2015
From: shoyer at gmail.com (Stephan Hoyer)
Date: Tue, 3 Mar 2015 11:01:48 -0800
Subject: [Numpy-discussion] ANN: xray v0.4 released
Message-ID:

I'm pleased to announce a major release of xray, v0.4.

xray is an open source project and Python package that aims to bring the
labeled data power of pandas to the physical sciences, by providing
N-dimensional variants of the core pandas data structures. Our goal is to
provide a pandas-like and pandas-compatible toolkit for analytics on
multi-dimensional arrays, rather than the tabular data for which pandas
excels. Our approach adopts the Common Data Model for self-describing
scientific data in widespread use in the Earth sciences: xray.Dataset is
an in-memory representation of a netCDF file.

Documentation: http://xray.readthedocs.org/
GitHub: https://github.com/xray/xray

Highlights of this release:

* Automatic alignment of index labels in arithmetic and when combining
  arrays or datasets.
* Aggregations like mean now skip missing values by default.
* Relaxed equality rules in concat and merge for variables with equal
  value(s) but different shapes.
* New drop method for dropping variables or index labels.
* Support for reindexing with a fill method like pandas.

For more details, read the full release notes:
http://xray.readthedocs.org/en/stable/whats-new.html

Best,
Stephan
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From charlesr.harris at gmail.com  Tue Mar 3 19:11:58 2015
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Tue, 3 Mar 2015 17:11:58 -0700
Subject: [Numpy-discussion] linalg.norm problems
Message-ID:

Hi All,

This is with reference to issue #5626. Currently linalg.norm converts the
input like so `x = asarray(x)`. This can produce integer arrays, which in
turn may create problems of overflow, or the failure of the abs functions
for minimum values of signed integer types. I propose to convert the input
to a minimum precision of float32. However, this will be a change in
behavior. I'd guess that that might not be much of a problem, as otherwise
it is likely that this problem would have been reported earlier.

Thoughts?

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
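A small demonstration of the overflow being described (values are
illustrative; on an affected 1.9.x build the intermediate integer
products can wrap silently before the square root is taken):

    import numpy as np

    v = np.array([50000, 50000], dtype=np.int32)
    # 50000**2 + 50000**2 = 5e9 exceeds the int32 maximum (~2.1e9),
    # so the squared sum may wrap and the norm come out wrong
    print(np.linalg.norm(v))                     # expected ~70710.678
    # casting to a minimum floating precision first, as proposed, avoids it
    print(np.linalg.norm(v.astype(np.float64)))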
From jaime.frio at gmail.com  Tue Mar 3 19:21:39 2015
From: jaime.frio at gmail.com (=?UTF-8?Q?Jaime_Fern=C3=A1ndez_del_R=C3=ADo?=)
Date: Tue, 3 Mar 2015 16:21:39 -0800
Subject: [Numpy-discussion] linalg.norm problems
In-Reply-To:
References:
Message-ID:

On Tue, Mar 3, 2015 at 4:11 PM, Charles R Harris wrote:

> Hi All,
>
> This is with reference to issue #5626. Currently linalg.norm converts the
> input like so `x = asarray(x)`. This can produce integer arrays, which in
> turn may create problems of overflow, or the failure of the abs functions
> for minimum values of signed integer types. I propose to convert the input
> to a minimum precision of float32. However, this will be a change in
> behavior. I'd guess that that might not be much of a problem, as otherwise
> it is likely that this problem would have been reported earlier.
>
> Thoughts?

Not sure if it makes sense here, but elsewhere (I think it was polyval)
we let object arrays through unchanged.

Jaime

--
(\__/)
( O.o)
( > <) This is Rabbit. Copy Rabbit into your signature and help him in
his plans for world domination.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From charlesr.harris at gmail.com  Tue Mar 3 19:31:28 2015
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Tue, 3 Mar 2015 17:31:28 -0700
Subject: [Numpy-discussion] linalg.norm problems
In-Reply-To:
References:
Message-ID:

On Tue, Mar 3, 2015 at 5:21 PM, Jaime Fernández del Río
<jaime.frio at gmail.com> wrote:

> On Tue, Mar 3, 2015 at 4:11 PM, Charles R Harris wrote:
>
>> Hi All,
>>
>> This is with reference to issue #5626. Currently linalg.norm converts
>> the input like so `x = asarray(x)`. This can produce integer arrays,
>> which in turn may create problems of overflow, or the failure of the abs
>> functions for minimum values of signed integer types. I propose to
>> convert the input to a minimum precision of float32. However, this will
>> be a change in behavior. I'd guess that that might not be much of a
>> problem, as otherwise it is likely that this problem would have been
>> reported earlier.
>>
>> Thoughts?
>
> Not sure if it makes sense here, but elsewhere (I think it was polyval)
> we let object arrays through unchanged.

That would still work. I'm thinking something like

    x = asarray(x)
    dt = result_type(x, np.float32)
    if x.dtype.type is not dt.type:
        x = x.astype(dt)

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From charlesr.harris at gmail.com  Tue Mar 3 19:34:34 2015
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Tue, 3 Mar 2015 17:34:34 -0700
Subject: [Numpy-discussion] linalg.norm problems
In-Reply-To:
References:
Message-ID:

On Tue, Mar 3, 2015 at 5:31 PM, Charles R Harris wrote:

> On Tue, Mar 3, 2015 at 5:21 PM, Jaime Fernández del Río
> <jaime.frio at gmail.com> wrote:
>
>> Not sure if it makes sense here, but elsewhere (I think it was polyval)
>> we let object arrays through unchanged.
>
> That would still work. I'm thinking something like
>
>     x = asarray(x)
>     dt = result_type(x, np.float32)
>     if x.dtype.type is not dt.type:
>         x = x.astype(dt)

I'd actually like to add a `min_dtype` keyword to asarray; we need it in
several places.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ralf.gommers at gmail.com  Tue Mar 3 21:12:44 2015
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Wed, 4 Mar 2015 03:12:44 +0100
Subject: [Numpy-discussion] linalg.norm problems
In-Reply-To:
References:
Message-ID:

On Wed, Mar 4, 2015 at 1:34 AM, Charles R Harris wrote:

> On Tue, Mar 3, 2015 at 5:31 PM, Charles R Harris wrote:
>
>> On Tue, Mar 3, 2015 at 5:21 PM, Jaime Fernández del Río
>> <jaime.frio at gmail.com> wrote:
>>
>>> Not sure if it makes sense here, but elsewhere (I think it was polyval)
>>> we let object arrays through unchanged.
>>
>> That would still work. I'm thinking something like
>>
>>     x = asarray(x)
>>     dt = result_type(x, np.float32)
>>     if x.dtype.type is not dt.type:
>>         x = x.astype(dt)
>
> I'd actually like to add a `min_dtype` keyword to asarray; we need it in
> several places.

That sounds like a good idea.

Ralf
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
From sank.daniel at gmail.com  Wed Mar 4 00:59:19 2015
From: sank.daniel at gmail.com (Daniel Sank)
Date: Tue, 3 Mar 2015 21:59:19 -0800
Subject: [Numpy-discussion] Would like to patch docstring for numpy.random.normal
In-Reply-To: <1557076892447072653.462812sturla.molden-gmail.com@news.gmane.org>
References: <1557076892447072653.462812sturla.molden-gmail.com@news.gmane.org>
Message-ID:

Sturla,

> Change the name of keyword arguments?
> This would be evil.

Yes, I see this now. It is really a shame these were defined as keyword
arguments. I've shown the docstring to a few people and so far they all
agree that it is unnecessarily convoluted.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From charlesr.harris at gmail.com  Wed Mar 4 19:27:40 2015
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Wed, 4 Mar 2015 17:27:40 -0700
Subject: [Numpy-discussion] linalg.norm problems
In-Reply-To:
References:
Message-ID:

On Tue, Mar 3, 2015 at 7:12 PM, Ralf Gommers wrote:

> On Wed, Mar 4, 2015 at 1:34 AM, Charles R Harris wrote:
>
>> On Tue, Mar 3, 2015 at 5:31 PM, Charles R Harris wrote:
>>
>>> On Tue, Mar 3, 2015 at 5:21 PM, Jaime Fernández del Río
>>> <jaime.frio at gmail.com> wrote:
>>>
>>>> Not sure if it makes sense here, but elsewhere (I think it was
>>>> polyval) we let object arrays through unchanged.
>>>
>>> That would still work. I'm thinking something like
>>>
>>>     x = asarray(x)
>>>     dt = result_type(x, np.float32)
>>>     if x.dtype.type is not dt.type:
>>>         x = x.astype(dt)
>>
>> I'd actually like to add a `min_dtype` keyword to asarray; we need it in
>> several places.
>
> That sounds like a good idea.

Not sure what idea you are referring to, but I've added a `precision`
keyword in gh-5634.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From arnd.baecker at web.de  Thu Mar 5 02:52:21 2015
From: arnd.baecker at web.de (Arnd Baecker)
Date: Thu, 5 Mar 2015 08:52:21 +0100 (CET)
Subject: [Numpy-discussion] numpy pickling problem - python 2 vs. python 3
Message-ID:

Dear all,

when preparing the transition of our repositories from python 2 to
python 3, I encountered a problem loading pytables (.h5) files generated
using python 2. I suspect that it is caused by a problem with pickling
numpy arrays under python 3:

The code appended at the end of this mail works fine on either python 2.7
or python 3.4; however, generating the data on python 2 and trying to
load it on python 3 gives some strange string
( b'(lp1\ncnumpy.core.multiarray\n_reconstruct\np2\n(cnumpy\nndarray ...)
instead of

    [array([ 0.,  1.,  2.,  3.,  4.,  5.]),
     array([ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.])]

The problem sounds very similar to the one reported here
https://github.com/numpy/numpy/issues/4879
which was fixed with numpy 1.9.

I tried different versions/combinations of numpy (including 1.9.2) and
always end up with the above result. Also I tried to reduce the problem
down to the level of pure numpy and pickle (as in the above bug report):

    import numpy as np
    import pickle
    arr1 = np.linspace(0.0, 1.0, 2)
    arr2 = np.linspace(0.0, 2.0, 3)
    data = [arr1, arr2]

    p = pickle.dumps(data)
    print(pickle.loads(p))
    p

Using the resulting string for p as input string (with b added at the
beginning) under python 3 gives

    UnicodeDecodeError: 'ascii' codec can't decode byte 0xf0 in
    position 14: ordinal not in range(128)

Can someone reproduce the problem with pytables?
Is there maybe a work-around?
(And no: I can't re-generate the "old" data files - it's hundreds of
.h5 files ... ;-).

Many thanks, best, Arnd

##############################################################################
"""Illustrate problem with pytables data - python 2 to python 3."""

from __future__ import print_function

import sys
import numpy as np
import tables as tb


def main():
    """Run the example."""
    print("np.__version__=", np.__version__)
    check_on_same_version = False

    arr1 = np.linspace(0.0, 5.0, 6)
    arr2 = np.linspace(0.0, 10.0, 11)
    data = [arr1, arr2]

    # Only generate on python 2.X or check on the same python version:
    if sys.version < "3.0" or check_on_same_version:
        fpt = tb.open_file("tstdat.h5", mode="w")
        fpt.set_node_attr(fpt.root, "list_of_arrays", data)
        fpt.close()

    # Load the saved file:
    fpt = tb.open_file("tstdat.h5", mode="r")
    result = fpt.get_node_attr("/", "list_of_arrays")
    fpt.close()
    print("Loaded:", result)

main()

From jaakko.luttinen at aalto.fi  Thu Mar 5 03:49:30 2015
From: jaakko.luttinen at aalto.fi (Jaakko Luttinen)
Date: Thu, 5 Mar 2015 10:49:30 +0200
Subject: [Numpy-discussion] ANN: BayesPy 0.3 released
Message-ID: <54F8189A.6090005@aalto.fi>

Dear all,

I am pleased to announce that BayesPy 0.3 has been released.

BayesPy provides tools for variational Bayesian inference. The user can
easily construct conjugate exponential family models from nodes and run
approximate posterior inference. BayesPy aims to be efficient and
flexible enough for experts but also accessible for casual users.

-----------------------------------------------------------------------

This release adds several state-of-the-art VB features. Below is a list
of significant new features in this release:

* Gradient-based optimization of the nodes by using either the Euclidean
  or Riemannian/natural gradient. This enables, for instance, the
  Riemannian conjugate gradient method.
* Collapsed variational inference to improve the speed of learning.
* Stochastic variational inference to improve scalability.
* Pattern search to improve the speed of learning.
* Deterministic annealing to improve robustness against initializations.
* Gaussian Markov chains can use input signals.
More details about the new features can be found here:
http://www.bayespy.org/user_guide/advanced.html

------------------------------------------------------

PyPI: https://pypi.python.org/pypi/bayespy/0.3
Git repository: https://github.com/bayespy/bayespy
Documentation: http://www.bayespy.org/

Best regards,
Jaakko

From stefanv at berkeley.edu  Thu Mar 5 05:04:20 2015
From: stefanv at berkeley.edu (Stefan van der Walt)
Date: Thu, 05 Mar 2015 02:04:20 -0800
Subject: [Numpy-discussion] ANN: scikit-image 0.11
Message-ID: <874mpzu5ej.fsf@berkeley.edu>

Announcement: scikit-image 0.11.0
=================================

We're happy to announce the release of scikit-image v0.11.0!

scikit-image is an image processing toolbox for SciPy that includes
algorithms for segmentation, geometric transformations, color space
manipulation, analysis, filtering, morphology, feature detection, and
more.

For more information, examples, and documentation, please visit our
website: http://scikit-image.org

Highlights
----------

For this release, we merged over 200 pull requests with bug fixes,
cleanups, improved documentation and new features. Highlights include:

- Region Adjacency Graphs
  - Color distance RAGs (#1031)
  - Threshold Cut on RAGs (#1031)
  - Similarity RAGs (#1080)
  - Normalized Cut on RAGs (#1080)
  - RAG drawing (#1087)
  - Hierarchical merging (#1100)
- Sub-pixel shift registration (#1066)
- Non-local means denoising (#874)
- Sliding window histogram (#1127)
- More illuminants in color conversion (#1130)
- Handling of CMYK images (#1360)
- `stop_probability` for RANSAC (#1176)
- Li thresholding (#1376)
- Signed edge operators (#1240)
- Full ndarray support for `peak_local_max` (#1355)
- Improve conditioning of geometric transformations (#1319)
- Standardize handling of multi-image files (#1200)
- Ellipse structuring element (#1298)
- Multi-line drawing tool (#1065), line handle style (#1179)
- Point in polygon testing (#1123)
- Rotation around a specified center (#1168)
- Add `shape` option to drawing functions (#1222)
- Faster regionprops (#1351)
- `skimage.future` package (#1365)
- More robust I/O module (#1189)

API Changes
-----------

- The ``skimage.filter`` subpackage has been renamed to ``skimage.filters``.
- Some edge detectors returned values greater than 1--their results are
  now appropriately scaled with a factor of ``sqrt(2)``.

Contributors to this release
----------------------------

(Listed alphabetically by last name)

- Fedor Baart
- Vighnesh Birodkar
- François Boulogne
- Nelson Brown
- Alexey Buzmakov
- Julien Coste
- Phil Elson
- Adam Feuer
- Jim Fienup
- Geoffrey French
- Emmanuelle Gouillart
- Charles Harris
- Jonathan Helmus
- Alexander Iacchetta
- Ivana Kajić
- Kevin Keraudren
- Almar Klein
- Gregory R. Lee
- Jeremy Metz
- Stuart Mumford
- Damian Nadales
- Pablo Márquez Neila
- Juan Nunez-Iglesias
- Rebecca Roisin
- Jasper St. Pierre
- Jacopo Sabbatini
- Michael Sarahan
- Salvatore Scaramuzzino
- Phil Schaf
- Johannes Schönberger
- Tim Seifert
- Arve Seljebu
- Steven Silvester
- Julian Taylor
- Matěj Týč
- Alexey Umnov
- Pratap Vardhan
- Stefan van der Walt
- Joshua Warner
- Tony S Yu

From charlesr.harris at gmail.com  Thu Mar 5 11:40:20 2015
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Thu, 5 Mar 2015 09:40:20 -0700
Subject: [Numpy-discussion] Adding keyword to asarray and asanyarray.
Message-ID:

Hi All,

This is apropos gh-5634, a PR adding a precision keyword to asarray and
asanyarray.
The PR description is

> The precision keyword differs from the current dtype keyword in the
> following way.
>
> - It specifies a minimum precision. If the precision of the input is
>   greater than the specified precision, the input precision is preserved.
> - Complex types are preserved. A specified floating precision applies
>   to the dtype of the real and complex parts separately.
>
> For example, both complex128 and float64 dtypes have the same precision
> and an array of dtype float64 will be unchanged if the specified
> precision is float32.
>
> Ideally the precision keyword would be pushed down into the array
> constructor so that the resulting dtype could be determined before the
> array is constructed, but that would require adding new functions as the
> current constructors are part of the API and cannot have their
> signatures changed.

The name of the keyword is open to discussion, as well as its acceptable
values. And of course, anything else that might come to mind ;)

Thoughts?

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ben.root at ou.edu  Thu Mar 5 11:42:24 2015
From: ben.root at ou.edu (Benjamin Root)
Date: Thu, 5 Mar 2015 11:42:24 -0500
Subject: [Numpy-discussion] Adding keyword to asarray and asanyarray.
In-Reply-To:
References:
Message-ID:

dare I say... datetime64/timedelta64 support?

::ducks::

Ben Root

On Thu, Mar 5, 2015 at 11:40 AM, Charles R Harris wrote:

> Hi All,
>
> This is apropos gh-5634, a PR adding a precision keyword to asarray and
> asanyarray. The PR description is
>
>> The precision keyword differs from the current dtype keyword in the
>> following way.
>>
>> - It specifies a minimum precision. If the precision of the input is
>>   greater than the specified precision, the input precision is preserved.
>> - Complex types are preserved. A specified floating precision applies
>>   to the dtype of the real and complex parts separately.
>>
>> For example, both complex128 and float64 dtypes have the same precision
>> and an array of dtype float64 will be unchanged if the specified
>> precision is float32.
>>
>> Ideally the precision keyword would be pushed down into the array
>> constructor so that the resulting dtype could be determined before the
>> array is constructed, but that would require adding new functions as the
>> current constructors are part of the API and cannot have their
>> signatures changed.
>
> The name of the keyword is open to discussion, as well as its acceptable
> values. And of course, anything else that might come to mind ;)
>
> Thoughts?
>
> Chuck
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From chris.barker at noaa.gov  Thu Mar 5 12:04:59 2015
From: chris.barker at noaa.gov (Chris Barker)
Date: Thu, 5 Mar 2015 09:04:59 -0800
Subject: [Numpy-discussion] Adding keyword to asarray and asanyarray.
In-Reply-To:
References:
Message-ID:

On Thu, Mar 5, 2015 at 8:42 AM, Benjamin Root wrote:

> dare I say... datetime64/timedelta64 support?

well, the precision of those is 64 bits, yes? so if you asked for less
than that, you'd still get a dt64. If you asked for 64 bits, you'd get
it; if you asked for datetime128 -- what would you get???

a 128 bit integer? or an Exception, because there is no 128bit datetime
dtype.
But I think this is the same problem with any dtype -- if you ask for a
precision that doesn't exist, you're going to get an error.

Is there a more detailed description of the proposed feature anywhere? Do
you specify a dtype as a precision? or just the precision, and let the
dtype figure it out for itself, i.e.:

    precision=64

would give you a float64 if the passed in array was a float type, but an
int64 if the passed in array was an int type, or a uint64 if the passed in
array was an unsigned int type, etc.....

But in the end, I wonder about the use case. I generally use asarray one
of two ways:

Without a dtype -- to simply make sure I've got an ndarray of SOME dtype.

or

With a dtype - because I really care about the dtype -- usually because I
need to pass it on to C code or something.

I don't think I'd ever need at least some precision, but not care if I got
more than that....

-Chris

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception

Chris.Barker at noaa.gov
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ben.root at ou.edu  Thu Mar 5 12:11:56 2015
From: ben.root at ou.edu (Benjamin Root)
Date: Thu, 5 Mar 2015 12:11:56 -0500
Subject: [Numpy-discussion] Adding keyword to asarray and asanyarray.
In-Reply-To:
References:
Message-ID:

On Thu, Mar 5, 2015 at 12:04 PM, Chris Barker wrote:

> well, the precision of those is 64 bits, yes? so if you asked for less
> than that, you'd still get a dt64. If you asked for 64 bits, you'd get
> it; if you asked for datetime128 -- what would you get???
>
> a 128 bit integer? or an Exception, because there is no 128bit datetime
> dtype.

I was more thinking of datetime64/timedelta64's ability to specify the
time units.

Ben Root
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From charlesr.harris at gmail.com  Thu Mar 5 12:33:52 2015
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Thu, 5 Mar 2015 10:33:52 -0700
Subject: [Numpy-discussion] Adding keyword to asarray and asanyarray.
In-Reply-To:
References:
Message-ID:

On Thu, Mar 5, 2015 at 10:04 AM, Chris Barker wrote:

> On Thu, Mar 5, 2015 at 8:42 AM, Benjamin Root wrote:
>
>> dare I say... datetime64/timedelta64 support?
>
> well, the precision of those is 64 bits, yes? so if you asked for less
> than that, you'd still get a dt64. If you asked for 64 bits, you'd get
> it; if you asked for datetime128 -- what would you get???
>
> a 128 bit integer? or an Exception, because there is no 128bit datetime
> dtype.
>
> But I think this is the same problem with any dtype -- if you ask for a
> precision that doesn't exist, you're going to get an error.
>
> Is there a more detailed description of the proposed feature anywhere? Do
> you specify a dtype as a precision? or just the precision, and let the
> dtype figure it out for itself, i.e.:
>
>     precision=64
>
> would give you a float64 if the passed in array was a float type, but an
> int64 if the passed in array was an int type, or a uint64 if the passed
> in array was an unsigned int type, etc.....
>
> But in the end, I wonder about the use case. I generally use asarray one
> of two ways:
>
> Without a dtype -- to simply make sure I've got an ndarray of SOME dtype.
>
> or
>
> With a dtype - because I really care about the dtype -- usually because I
> need to pass it on to C code or something.
> I don't think I'd ever need at least some precision, but not care if I
> got more than that...

The main use that I want to cover is that float64 and complex128 have the
same precision and it would be good if either is acceptable. Also, one
might just want either float32 or float64, not just one of the two.
Another intent is to make the fewest possible copies. The determination of
the resulting type is made using the result_type function.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
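A sketch of this "minimum precision" behavior in terms of result_type
(the helper name below is made up; it mirrors the snippet from the
linalg.norm thread rather than the actual PR):

    import numpy as np

    np.result_type(np.float64, np.float32)    # float64: input precision kept
    np.result_type(np.float16, np.float32)    # float32: upcast to the minimum
    np.result_type(np.complex64, np.float64)  # complex128: complex preserved

    def asarray_minprec(x, precision=np.float32):
        # hypothetical helper, not the PR's API
        x = np.asarray(x)
        dt = np.result_type(x, precision)
        return x if x.dtype == dt else x.astype(dt)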
From charlesr.harris at gmail.com  Thu Mar 5 13:09:08 2015
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Thu, 5 Mar 2015 11:09:08 -0700
Subject: [Numpy-discussion] appveyor CI
Message-ID:

Anyone familiar with appveyor? Is this something we could use to
test/build numpy on windows machines? It is free for open source.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From stefanv at berkeley.edu  Thu Mar 5 14:42:21 2015
From: stefanv at berkeley.edu (Stefan van der Walt)
Date: Thu, 05 Mar 2015 11:42:21 -0800
Subject: [Numpy-discussion] appveyor CI
In-Reply-To:
References:
Message-ID: <87wq2vs02q.fsf@berkeley.edu>

Hi Chuck

On 2015-03-05 10:09:08, Charles R Harris wrote:
> Anyone familiar with appveyor? Is this something we could use to
> test/build numpy on windows machines? It is free for open source.

We already use this for scikit-image, and you are welcome to grab the
setup here:

https://github.com/scikit-image/scikit-image/blob/master/appveyor.yml

GitHub now also supports multiple status reporting out of the box:

https://github.com/blog/1935-see-results-from-all-pull-request-status-checks

Stéfan

From denis.engemann at gmail.com  Thu Mar 5 14:45:12 2015
From: denis.engemann at gmail.com (Denis-Alexander Engemann)
Date: Thu, 5 Mar 2015 20:45:12 +0100
Subject: [Numpy-discussion] appveyor CI
In-Reply-To: <87wq2vs02q.fsf@berkeley.edu>
References: <87wq2vs02q.fsf@berkeley.edu>
Message-ID:

Same for MNE-Python:
https://github.com/mne-tools/mne-python/blob/master/appveyor.yml

Denis

2015-03-05 20:42 GMT+01:00 Stefan van der Walt:

> Hi Chuck
>
> On 2015-03-05 10:09:08, Charles R Harris wrote:
> > Anyone familiar with appveyor? Is this something we could use to
> > test/build numpy on windows machines? It is free for open source.
>
> We already use this for scikit-image, and you are welcome to grab the
> setup here:
>
> https://github.com/scikit-image/scikit-image/blob/master/appveyor.yml
>
> GitHub now also supports multiple status reporting out of the box:
>
> https://github.com/blog/1935-see-results-from-all-pull-request-status-checks
>
> Stéfan
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From charlesr.harris at gmail.com  Thu Mar 5 15:14:12 2015
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Thu, 5 Mar 2015 13:14:12 -0700
Subject: [Numpy-discussion] appveyor CI
In-Reply-To: <87wq2vs02q.fsf@berkeley.edu>
References: <87wq2vs02q.fsf@berkeley.edu>
Message-ID:

On Thu, Mar 5, 2015 at 12:42 PM, Stefan van der Walt wrote:

> Hi Chuck
>
> On 2015-03-05 10:09:08, Charles R Harris wrote:
> > Anyone familiar with appveyor? Is this something we could use to
> > test/build numpy on windows machines? It is free for open source.
>
> We already use this for scikit-image, and you are welcome to grab the
> setup here:
>
> https://github.com/scikit-image/scikit-image/blob/master/appveyor.yml
>
> GitHub now also supports multiple status reporting out of the box:
>
> https://github.com/blog/1935-see-results-from-all-pull-request-status-checks

Thanks. Anything tricky about setting up an appveyor account?

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From rnelsonchem at gmail.com  Thu Mar 5 15:35:15 2015
From: rnelsonchem at gmail.com (Ryan Nelson)
Date: Thu, 5 Mar 2015 15:35:15 -0500
Subject: [Numpy-discussion] numpy pickling problem - python 2 vs. python 3
In-Reply-To:
References:
Message-ID:

This works if run from Py3. Don't know if it will *always* work. From
that GH discussion you linked, it sounds like that is a bit of a hack.

##############
"""Illustrate problem with pytables data - python 2 to python 3."""

from __future__ import print_function

import sys
import numpy as np
import tables as tb
import pickle as pkl


def main():
    """Run the example."""
    print("np.__version__=", np.__version__)
    check_on_same_version = False

    arr1 = np.linspace(0.0, 5.0, 6)
    arr2 = np.linspace(0.0, 10.0, 11)
    data = [arr1, arr2]

    # Only generate on python 2.X or check on the same python version:
    if sys.version < "3.0" or check_on_same_version:
        fpt = tb.open_file("tstdat.h5", mode="w")
        fpt.set_node_attr(fpt.root, "list_of_arrays", data)
        fpt.close()

    # Load the saved file:
    fpt = tb.open_file("tstdat.h5", mode="r")
    result = fpt.get_node_attr("/", "list_of_arrays")
    fpt.close()
    print("Loaded:", pkl.loads(result, encoding="latin1"))

main()
###############

However, I would consider defining some sort of v2 of your HDF file
format, which converts all of the lists of arrays to CArrays or EArrays
in the HDF file.
(https://pytables.github.io/usersguide/libref/homogenous_storage.html)
Otherwise, what is the advantage of using HDF files over just plain
shelves?... Just a thought.

Ryan

On Thu, Mar 5, 2015 at 2:52 AM, Arnd Baecker wrote:

> Dear all,
>
> when preparing the transition of our repositories from python 2 to
> python 3, I encountered a problem loading pytables (.h5) files generated
> using python 2. I suspect that it is caused by a problem with pickling
> numpy arrays under python 3:
>
> The code appended at the end of this mail works fine on either python
> 2.7 or python 3.4; however, generating the data on python 2 and trying
> to load it on python 3 gives some strange string
> ( b'(lp1\ncnumpy.core.multiarray\n_reconstruct\np2\n(cnumpy\nndarray ...)
> instead of
>     [array([ 0.,  1.,  2.,  3.,  4.,  5.]),
>      array([ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.])]
>
> The problem sounds very similar to the one reported here
> https://github.com/numpy/numpy/issues/4879
> which was fixed with numpy 1.9.
>
> I tried different versions/combinations of numpy (including 1.9.2) and
> always end up with the above result. Also I tried to reduce the problem
> down to the level of pure numpy and pickle (as in the above bug report):
>
>     import numpy as np
>     import pickle
>     arr1 = np.linspace(0.0, 1.0, 2)
>     arr2 = np.linspace(0.0, 2.0, 3)
>     data = [arr1, arr2]
>
>     p = pickle.dumps(data)
>     print(pickle.loads(p))
>     p
>
> Using the resulting string for p as input string (with b added at the
> beginning) under python 3 gives
>     UnicodeDecodeError: 'ascii' codec can't decode byte 0xf0 in
>     position 14: ordinal not in range(128)
>
> Can someone reproduce the problem with pytables?
> Is there maybe a work-around?
> (And no: I can't re-generate the "old" data files - it's hundreds of
> .h5 files ... ;-).
>
> Many thanks, best, Arnd
>
> ##########################################################################
> """Illustrate problem with pytables data - python 2 to python 3."""
>
> from __future__ import print_function
>
> import sys
> import numpy as np
> import tables as tb
>
>
> def main():
>     """Run the example."""
>     print("np.__version__=", np.__version__)
>     check_on_same_version = False
>
>     arr1 = np.linspace(0.0, 5.0, 6)
>     arr2 = np.linspace(0.0, 10.0, 11)
>     data = [arr1, arr2]
>
>     # Only generate on python 2.X or check on the same python version:
>     if sys.version < "3.0" or check_on_same_version:
>         fpt = tb.open_file("tstdat.h5", mode="w")
>         fpt.set_node_attr(fpt.root, "list_of_arrays", data)
>         fpt.close()
>
>     # Load the saved file:
>     fpt = tb.open_file("tstdat.h5", mode="r")
>     result = fpt.get_node_attr("/", "list_of_arrays")
>     fpt.close()
>     print("Loaded:", result)
>
> main()
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
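A sketch of the v2 layout Ryan suggests -- storing each array as a real
HDF5 array node instead of a pickled attribute, so no unpickling is
needed across Python versions (file and node names are illustrative, and
create_carray with the obj argument assumes PyTables >= 3.0):

    import numpy as np
    import tables as tb

    with tb.open_file("tstdat_v2.h5", mode="w") as fpt:
        # one CArray node per array; readable from Python 2 and 3 alike
        fpt.create_carray("/", "arr_0", obj=np.linspace(0.0, 5.0, 6))
        fpt.create_carray("/", "arr_1", obj=np.linspace(0.0, 10.0, 11))

    with tb.open_file("tstdat_v2.h5", mode="r") as fpt:
        data = [fpt.root.arr_0.read(), fpt.root.arr_1.read()]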
From arnd.baecker at web.de  Thu Mar 5 17:52:48 2015
From: arnd.baecker at web.de (Arnd Baecker)
Date: Thu, 5 Mar 2015 23:52:48 +0100 (CET)
Subject: [Numpy-discussion] numpy pickling problem - python 2 vs. python 3
In-Reply-To:
References:
Message-ID:

On Thu, 5 Mar 2015, Ryan Nelson wrote:

> This works if run from Py3. Don't know if it will *always* work. From
> that GH discussion you linked, it sounds like that is a bit of a hack.

Great - based on your code I could modify my loader routine so that on
python 3 it can load the files generated on python 2. Many thanks!

Still I would have thought that this should be working out of the box,
i.e. without the pickle.loads trick?

[... code ...]

> However, I would consider defining some sort of v2 of your HDF file
> format, which converts all of the lists of arrays to CArrays or EArrays
> in the HDF file.
> (https://pytables.github.io/usersguide/libref/homogenous_storage.html)
> Otherwise, what is the advantage of using HDF files over just plain
> shelves?... Just a thought.

Thanks for the suggestion - in our usage scenario lists of arrays is a
border case and only small parts of the data in the files have this. The
larger arrays are written directly. So at this point I don't mind if the
lists of arrays are written in the current way (as long as things load
fine). For our applications the main benefit of using HDF files is the
possibility to easily look into them (e.g. using vitables) - so this
means that I don't use all the nice more advanced features of HDF at
this point... ;-).

Again many thanks for the prompt reply and solution!

Best, Arnd

> Ryan
>
> On Thu, Mar 5, 2015 at 2:52 AM, Arnd Baecker wrote:
>   Dear all,
>
>   when preparing the transition of our repositories from python 2 to
>   python 3, I encountered a problem loading pytables (.h5) files
>   generated using python 2. I suspect that it is caused by a problem
>   with pickling numpy arrays under python 3:
>
>   The code appended at the end of this mail works fine on either python
>   2.7 or python 3.4; however, generating the data on python 2 and
>   trying to load it on python 3 gives some strange string
>   ( b'(lp1\ncnumpy.core.multiarray\n_reconstruct\np2\n(cnumpy\nndarray ...)
>   instead of
>       [array([ 0.,  1.,  2.,  3.,  4.,  5.]),
>        array([ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.])]
>
>   The problem sounds very similar to the one reported here
>   https://github.com/numpy/numpy/issues/4879
>   which was fixed with numpy 1.9.
>
>   I tried different versions/combinations of numpy (including 1.9.2)
>   and always end up with the above result.
>   Also I tried to reduce the problem down to the level of pure numpy
>   and pickle (as in the above bug report):
>
>       import numpy as np
>       import pickle
>       arr1 = np.linspace(0.0, 1.0, 2)
>       arr2 = np.linspace(0.0, 2.0, 3)
>       data = [arr1, arr2]
>
>       p = pickle.dumps(data)
>       print(pickle.loads(p))
>       p
>
>   Using the resulting string for p as input string (with b added at the
>   beginning) under python 3 gives
>       UnicodeDecodeError: 'ascii' codec can't decode byte 0xf0 in
>       position 14: ordinal not in range(128)
>
>   Can someone reproduce the problem with pytables?
>   Is there maybe a work-around?
>   (And no: I can't re-generate the "old" data files - it's hundreds of
>   .h5 files ... ;-).
>
>   Many thanks, best, Arnd
>
>   ########################################################################
>   """Illustrate problem with pytables data - python 2 to python 3."""
>
>   from __future__ import print_function
>
>   import sys
>   import numpy as np
>   import tables as tb
>
>
>   def main():
>       """Run the example."""
>       print("np.__version__=", np.__version__)
>       check_on_same_version = False
>
>       arr1 = np.linspace(0.0, 5.0, 6)
>       arr2 = np.linspace(0.0, 10.0, 11)
>       data = [arr1, arr2]
>
>       # Only generate on python 2.X or check on the same python version:
>       if sys.version < "3.0" or check_on_same_version:
>           fpt = tb.open_file("tstdat.h5", mode="w")
>           fpt.set_node_attr(fpt.root, "list_of_arrays", data)
>           fpt.close()
>
>       # Load the saved file:
>       fpt = tb.open_file("tstdat.h5", mode="r")
>       result = fpt.get_node_attr("/", "list_of_arrays")
>       fpt.close()
>       print("Loaded:", result)
>
>   main()
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion

From rmcgibbo at gmail.com  Thu Mar 5 19:38:34 2015
From: rmcgibbo at gmail.com (Robert McGibbon)
Date: Thu, 5 Mar 2015 16:38:34 -0800
Subject: [Numpy-discussion] appveyor CI
In-Reply-To:
References: <87wq2vs02q.fsf@berkeley.edu>
Message-ID:

From my experience, it's pretty easy, assuming you're prepared to pick up
some powershell. Some useful resources are

- Olivier Grisel's example.
  https://github.com/ogrisel/python-appveyor-demo
- I made a similar example, using conda.
  https://github.com/rmcgibbo/python-appveyor-conda-example

One problem is that appveyor is often quite slow compared to TravisCI, so
this can be a little annoying. But it's better than nothing.

-Robert
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From charlesr.harris at gmail.com  Thu Mar 5 20:07:56 2015
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Thu, 5 Mar 2015 18:07:56 -0700
Subject: [Numpy-discussion] appveyor CI
In-Reply-To:
References: <87wq2vs02q.fsf@berkeley.edu>
Message-ID:

On Thu, Mar 5, 2015 at 5:38 PM, Robert McGibbon wrote:

> From my experience, it's pretty easy, assuming you're prepared to pick up
> some powershell. Some useful resources are
>
> - Olivier Grisel's example.
>   https://github.com/ogrisel/python-appveyor-demo
> - I made a similar example, using conda.
>   https://github.com/rmcgibbo/python-appveyor-conda-example
>
> One problem is that appveyor is often quite slow compared to TravisCI, so
> this can be a little annoying. But it's better than nothing.

Do line endings in the scripts matter?

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From rmcgibbo at gmail.com  Thu Mar 5 21:36:25 2015
From: rmcgibbo at gmail.com (Robert McGibbon)
Date: Thu, 5 Mar 2015 18:36:25 -0800
Subject: [Numpy-discussion] appveyor CI
In-Reply-To:
References: <87wq2vs02q.fsf@berkeley.edu>
Message-ID:

I develop on linux and osx, and I haven't experienced any Appveyor
problems related to line endings, so I assume it's normalized somehow.

-Robert

On Mar 5, 2015 5:08 PM, "Charles R Harris" wrote:

> On Thu, Mar 5, 2015 at 5:38 PM, Robert McGibbon wrote:
>
>> From my experience, it's pretty easy, assuming you're prepared to pick
>> up some powershell. Some useful resources are
>>
>> - Olivier Grisel's example.
>>   https://github.com/ogrisel/python-appveyor-demo
>> - I made a similar example, using conda.
>>   https://github.com/rmcgibbo/python-appveyor-conda-example
>>
>> One problem is that appveyor is often quite slow compared to TravisCI,
>> so this can be a little annoying. But it's better than nothing.
>
> Do line endings in the scripts matter?
>
> Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From josef.pktd at gmail.com  Fri Mar 6 00:02:10 2015
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Fri, 6 Mar 2015 00:02:10 -0500
Subject: [Numpy-discussion] Adding keyword to asarray and asanyarray.
In-Reply-To:
References:
Message-ID:

On Thu, Mar 5, 2015 at 12:33 PM, Charles R Harris wrote:
> On Thu, Mar 5, 2015 at 10:04 AM, Chris Barker wrote:
>> On Thu, Mar 5, 2015 at 8:42 AM, Benjamin Root wrote:
>>> dare I say... datetime64/timedelta64 support?
>>
>> well, the precision of those is 64 bits, yes? so if you asked for less
>> than that, you'd still get a dt64. If you asked for 64 bits, you'd get
>> it; if you asked for datetime128 -- what would you get???
>>
>> a 128 bit integer? or an Exception, because there is no 128bit datetime
>> dtype.
>>
>> But I think this is the same problem with any dtype -- if you ask for a
>> precision that doesn't exist, you're going to get an error.
>>
>> Is there a more detailed description of the proposed feature anywhere?
>> Do you specify a dtype as a precision? or just the precision, and let
>> the dtype figure it out for itself, i.e.:
>>
>>     precision=64
>>
>> would give you a float64 if the passed in array was a float type, but
>> an int64 if the passed in array was an int type, or a uint64 if the
>> passed in array was an unsigned int type, etc.....
>>
>> But in the end, I wonder about the use case. I generally use asarray
>> one of two ways:
>>
>> Without a dtype -- to simply make sure I've got an ndarray of SOME
>> dtype.
>>
>> or
>>
>> With a dtype - because I really care about the dtype -- usually because
>> I need to pass it on to C code or something.
>>
>> I don't think I'd ever need at least some precision, but not care if I
>> got more than that...
>
> The main use that I want to cover is that float64 and complex128 have
> the same precision and it would be good if either is acceptable.
> Also, one might just want either float32 or float64, not just one of the
> two. Another intent is to make the fewest possible copies. The
> determination of the resulting type is made using the result_type
> function.

How does this work for object arrays, or datetime?

Can I specify at least float32 or float64, and have it raise an exception
if it cannot be converted?

The problem we have in statsmodels is that pandas frequently uses object
arrays and it messes up patsy or statsmodels if it's not explicitly
converted.

Josef

> Chuck
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion

From jaime.frio at gmail.com  Fri Mar 6 00:33:01 2015
From: jaime.frio at gmail.com (=?UTF-8?Q?Jaime_Fern=C3=A1ndez_del_R=C3=ADo?=)
Date: Thu, 5 Mar 2015 21:33:01 -0800
Subject: [Numpy-discussion] ufuncs now take a tuple of arrays as 'out' kwarg
Message-ID:

Hi all,

There is a PR, ready to be merged, that adds the possibility of passing a
tuple of arrays in the 'out' kwarg to ufuncs with multiple outputs:

https://github.com/numpy/numpy/pull/5621

The new functionality is as follows:

* If the ufunc has a single output, then the 'out' kwarg can either be a
  single array (or None) like today, or a tuple holding a single array
  (or None).
* If the ufunc has more than one output, then the 'out' kwarg must be a
  tuple with one array (or None) per output argument. The old behavior,
  where only the first output could be specified, is now deprecated, will
  raise a deprecation warning, and potentially be changed to an error in
  the future.
* In both cases, positional and keyword output arguments are
  incompatible. This has been made a little more strict, as the following
  is valid in <= 1.9.x but will now raise an error:

    np.add(2, 2, None, out=arr)

There seemed to be a reasonable amount of agreement on the goodness of
this change from the discussions on github, but I wanted to inform the
larger audience, in case there are any addressable concerns.

Jaime

--
(\__/)
( O.o)
( > <) This is Rabbit. Copy Rabbit into your signature and help him in
his plans for world domination.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
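A quick sketch of the new calling convention (np.modf is a convenient
two-output ufunc; this assumes a build with the PR above merged):

    import numpy as np

    x = np.linspace(0.5, 2.5, 5)
    frac = np.empty_like(x)
    whole = np.empty_like(x)

    # one array (or None) per output, passed as a tuple:
    np.modf(x, out=(frac, whole))

    # single-output ufuncs accept a one-element tuple as well:
    total = np.empty_like(x)
    np.add(x, x, out=(total,))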
From charlesr.harris at gmail.com  Fri Mar 6 07:59:16 2015
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Fri, 6 Mar 2015 05:59:16 -0700
Subject: [Numpy-discussion] Adding keyword to asarray and asanyarray.
In-Reply-To:
References:
Message-ID:

On Thu, Mar 5, 2015 at 10:02 PM, <josef.pktd at gmail.com> wrote:

> On Thu, Mar 5, 2015 at 12:33 PM, Charles R Harris wrote:
>> On Thu, Mar 5, 2015 at 10:04 AM, Chris Barker wrote:
>>> On Thu, Mar 5, 2015 at 8:42 AM, Benjamin Root wrote:
>>>> dare I say... datetime64/timedelta64 support?
>>>
>>> well, the precision of those is 64 bits, yes? so if you asked for less
>>> than that, you'd still get a dt64. If you asked for 64 bits, you'd get
>>> it; if you asked for datetime128 -- what would you get???
>>>
>>> a 128 bit integer? or an Exception, because there is no 128bit
>>> datetime dtype.
>>>
>>> But I think this is the same problem with any dtype -- if you ask for
>>> a precision that doesn't exist, you're going to get an error.
>>>
>>> Is there a more detailed description of the proposed feature anywhere?
>>> Do you specify a dtype as a precision? or just the precision, and let
>>> the dtype figure it out for itself, i.e.:
>>>
>>>     precision=64
>>>
>>> would give you a float64 if the passed in array was a float type, but
>>> an int64 if the passed in array was an int type, or a uint64 if the
>>> passed in array was an unsigned int type, etc.....
>>>
>>> But in the end, I wonder about the use case. I generally use asarray
>>> one of two ways:
>>>
>>> Without a dtype -- to simply make sure I've got an ndarray of SOME
>>> dtype.
>>>
>>> or
>>>
>>> With a dtype - because I really care about the dtype -- usually
>>> because I need to pass it on to C code or something.
>>>
>>> I don't think I'd ever need at least some precision, but not care if I
>>> got more than that...
>>
>> The main use that I want to cover is that float64 and complex128 have
>> the same precision and it would be good if either is acceptable. Also,
>> one might just want either float32 or float64, not just one of the two.
>> Another intent is to make the fewest possible copies. The determination
>> of the resulting type is made using the result_type function.
>
> How does this work for object arrays, or datetime?
>
> Can I specify at least float32 or float64, and have it raise an
> exception if it cannot be converted?
>
> The problem we have in statsmodels is that pandas frequently uses object
> arrays and it messes up patsy or statsmodels if it's not explicitly
> converted.

Object arrays go to object arrays, datetime64 depends.

    In [10]: result_type(ones(1, dtype=object_), float32)
    Out[10]: dtype('O')

Datetime64 seems to use the highest precision

    In [12]: result_type(ones(1, dtype='datetime64[D]'), 'datetime64[us]')
    Out[12]: dtype('<M8[us]')

    In [13]: result_type(ones(1, dtype='datetime64[D]'), 'datetime64[Y]')
    Out[13]: dtype('<M8[D]')

but doesn't convert to float

    In [11]: result_type(ones(1, dtype='datetime64[D]'), float32)
    ---------------------------------------------------------------------------
    TypeError                                 Traceback (most recent call last)
    <ipython-input-11-...> in <module>()
    ----> 1 result_type(ones(1, dtype='datetime64[D]'), float32)

    TypeError: invalid type promotion

What would you like it to do?

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From josef.pktd at gmail.com  Fri Mar 6 08:48:01 2015
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Fri, 6 Mar 2015 08:48:01 -0500
Subject: [Numpy-discussion] Adding keyword to asarray and asanyarray.
In-Reply-To:
References:
Message-ID:

On Fri, Mar 6, 2015 at 7:59 AM, Charles R Harris wrote:

> On Thu, Mar 5, 2015 at 10:02 PM, <josef.pktd at gmail.com> wrote:
>> On Thu, Mar 5, 2015 at 12:33 PM, Charles R Harris wrote:
>>> On Thu, Mar 5, 2015 at 10:04 AM, Chris Barker wrote:
>>>> On Thu, Mar 5, 2015 at 8:42 AM, Benjamin Root wrote:
>>>>> dare I say... datetime64/timedelta64 support?
>>>>
>>>> well, the precision of those is 64 bits, yes? so if you asked for
>>>> less than that, you'd still get a dt64. If you asked for 64 bits,
>>>> you'd get it; if you asked for datetime128 -- what would you get???
>>>>
>>>> a 128 bit integer? or an Exception, because there is no 128bit
>>>> datetime dtype.
>>>>
>>>> But I think this is the same problem with any dtype -- if you ask for
>>>> a precision that doesn't exist, you're going to get an error.
>>>>
>>>> Is there a more detailed description of the proposed feature
>>>> anywhere? Do you specify a dtype as a precision? or just the
>>>> precision, and let the dtype figure it out for itself, i.e.:
>>>>
>>>>     precision=64
>>>>
>>>> would give you a float64 if the passed in array was a float type, but
>>>> an int64 if the passed in array was an int type, or a uint64 if the
>>>> passed in array was an unsigned int type, etc.....
>>>>
>>>> But in the end, I wonder about the use case.
I generally use asarray in one >> >> of two ways: >> >> >> >> Without a dtype -- to simply make sure I've got an ndarray of SOME >> >> dtype. >> >> >> >> or >> >> >> >> With a dtype - because I really care about the dtype -- usually because >> >> I >> >> need to pass it on to C code or something. >> >> >> >> I don't think I'd ever need at least some precision, but not care if I >> >> got >> >> more than that... >> > >> > >> > The main use that I want to cover is that float64 and complex128 have >> > the >> > same precision and it would be good if either is acceptable. Also, one >> > might just want either float32 or float64, not just one of the two. >> > Another >> > intent is to make the fewest possible copies. The determination of the >> > resulting type is made using the result_type function. >> >> >> How does this work for object arrays, or datetime? >> >> Can I specify at least float32 or float64, and it raises an exception >> if it cannot be converted? >> >> The problem we have in statsmodels is that pandas frequently uses >> object arrays and it messes up patsy or statsmodels if it's not >> explicitly converted. > > > Object arrays go to object arrays, datetime64 depends. > > In [10]: result_type(ones(1, dtype=object_), float32) > Out[10]: dtype('O') > > > Datetime64 seems to use the highest precision > > In [12]: result_type(ones(1, dtype='datetime64[D]'), 'datetime64[us]') > Out[12]: dtype('<M8[us]') > > In [13]: result_type(ones(1, dtype='datetime64[D]'), 'datetime64[Y]') > Out[13]: dtype('<M8[D]') > > but doesn't convert to float > > In [11]: result_type(ones(1, dtype='datetime64[D]'), float32) > --------------------------------------------------------------------------- > TypeError Traceback (most recent call last) > <ipython-input-11-...> in <module>() > ----> 1 result_type(ones(1, dtype='datetime64[D]'), float32) > > TypeError: invalid type promotion > > What would you like it to do?

Note: the dtype handling in statsmodels is still a mess, and we just plugged some of the worst cases. What we would need is asarray with at least a minimum precision (e.g. float32) and raise an exception if it's not numeric, like string, object, custom dtypes ... However, we need custom dtype handling in statsmodels anyway, so the enhancement to asarray with exceptions would mainly be convenient to get something to work with because pandas and numpy are now "object array friendly". I assume scipy also has insufficient checks for non-numeric dtypes, AFAIR. Josef > > Chuck > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion >

From pav at iki.fi Fri Mar 6 08:55:17 2015 From: pav at iki.fi (Pauli Virtanen) Date: Fri, 6 Mar 2015 13:55:17 +0000 (UTC) Subject: [Numpy-discussion] numpy pickling problem - python 2 vs. python 3 References: Message-ID: Arnd Baecker <arnd.baecker at web.de> writes: [clip] > Still I would have thought that this should be working out-of-the box, > i.e. without the pickle.loads trick? Pickle files should be considered incompatible between Python 2 and Python 3. Python 3 interprets all bytes objects saved by Python 2 as str and attempts to decode them under some unicode locale. The default locale is ASCII, so it will simply just fail in most cases if the files contain any binary data. Failing by default is also the right thing to do, since the saved bytes objects might actually represent strings in some locale, and ASCII is the safest guess. This behavior is that of Python's pickle module, and does not depend on Numpy.
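For reference, the standard workaround on the loading side follows directly from this: tell Python 3's pickle how to treat the Python 2 bytes. A minimal sketch (the file name is illustrative):

import pickle

with open('py2_data.pkl', 'rb') as f:
    # 'latin1' maps every byte 0-255 to a code point, so the bytes->str
    # decoding step cannot fail on binary payloads; encoding='bytes'
    # instead returns the original bytes objects undecoded.
    data = pickle.load(f, encoding='latin1')

Note that 'latin1' is only a safe default for numerical payloads; strings that were saved in another locale will come back mis-decoded.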
From ben.root at ou.edu Fri Mar 6 09:33:43 2015 From: ben.root at ou.edu (Benjamin Root) Date: Fri, 6 Mar 2015 09:33:43 -0500 Subject: [Numpy-discussion] Adding keyword to asarray and asanyarray. In-Reply-To: References: Message-ID: On Fri, Mar 6, 2015 at 7:59 AM, Charles R Harris wrote: > Datetime64 seems to use the highest precision > > In [12]: result_type(ones(1, dtype='datetime64[D]'), 'datetime64[us]') > Out[12]: dtype('<M8[us]') > > In [13]: result_type(ones(1, dtype='datetime64[D]'), 'datetime64[Y]') > Out[13]: dtype('<M8[D]') Ah, yes, that's what I'm looking for. +1 from me to have this in asarray/asanyarray. Of course, there are always the usual caveats about converting your datetime data in this manner, but this would be helpful in many situations in writing functions that expect to deal with temporal data at the resolution of minutes or somesuch. Cheers! Ben Root -------------- next part -------------- An HTML attachment was scrubbed... URL:

From arnd.baecker at web.de Fri Mar 6 09:48:38 2015 From: arnd.baecker at web.de (Arnd Baecker) Date: Fri, 6 Mar 2015 15:48:38 +0100 (CET) Subject: [Numpy-discussion] numpy pickling problem - python 2 vs. python 3 In-Reply-To: References: Message-ID: On Fri, 6 Mar 2015, Pauli Virtanen wrote: > Arnd Baecker <arnd.baecker at web.de> writes: > [clip] >> Still I would have thought that this should be working out-of-the box, >> i.e. without the pickle.loads trick? > > Pickle files should be considered incompatible between Python 2 and Python 3. > > Python 3 interprets all bytes objects saved by Python 2 as str and attempts > to decode them under some unicode locale. The default locale is ASCII, so it > will simply just fail in most cases if the files contain any binary data. > > Failing by default is also the right thing to do, since the saved bytes > objects might actually represent strings in some locale, and ASCII is the > safest guess. > > This behavior is that of Python's pickle module, and does not depend on Numpy. Thanks a lot for the explanation! So what is then the recommended way to save data under python 2 so that they can still be loaded under python 3? For example using np.save with a list of arrays works fine either on python 2 or on python 3. However it does not work if one tries to open under python 3 a file generated before on python 2. (Again, because pickle is involved internally:

File "python3.4/site-packages/numpy/lib/npyio.py", line 393, in load
    return format.read_array(fid)
File "python34/lib/python3.4/site-packages/numpy/lib/format.py", line 602, in read_array
    array = pickle.load(fp)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xf0 ...

Just to be clear: I don't want to beat a dead horse here - for my usage via pytables I was able to solve the loading of old files following Ryan's solutions. Personally I don't use .npy files. Maybe saving a list containing arrays is an unusual example ... Still, I am a little bit worried about backwards-compatibility: being able to load old data files is an important issue as by this it is possible to check whether current code still reproduces previously obtained (maybe also published) results. Best, Arnd

From rnelsonchem at gmail.com Fri Mar 6 10:37:03 2015 From: rnelsonchem at gmail.com (Ryan Nelson) Date: Fri, 6 Mar 2015 10:37:03 -0500 Subject: [Numpy-discussion] numpy pickling problem - python 2 vs. python 3 In-Reply-To: References: Message-ID: Arnd, I can see where this is an issue.
If you are trying to update your code for Py3, I still think that it would really help to add a version attribute of some sort to your new HDF files. You can then write a little check in your access code that looks for this variable. If it is not present, you know that it is an old file, and you can use the trick that I gave you. Otherwise, it will process the file as normal. It could even throw a little error saying that the file is outdated. You could write a small conversion script that could run through old files and reprocess them into the new format. Fortunately, Python is pretty good at automating tasks, even for hundreds of files :) It might be informative to ask at the PyTables list to see what they've done. The Pandas folks also do a lot with HDF files, and they have certainly worked their way through the Py2-3 transition. Also, because this is an issue with Python pickle, a quick note on SO might get some hits. I tried your script using a list of lists, rather than a list of arrays, and the same problem still persists. So, as Pauli notes, this is going to be a problem regardless of the type of attributes you set; I think you're just going to have to hard-code some kind of check in your code to switch behavior. I recently switched to using Py3 exclusively, and although it was painful at first, I'm quite happy with Py3 overall. I also use the Anaconda Python distribution, which makes it very easy to have Py2 and Py3 environments if you need to switch back and forth. Sorry if that doesn't help much. Just some thoughts from my recent conversion experiences. Ryan

On Fri, Mar 6, 2015 at 9:48 AM, Arnd Baecker wrote: > On Fri, 6 Mar 2015, Pauli Virtanen wrote: > > > Arnd Baecker <arnd.baecker at web.de> writes: > > [clip] > >> Still I would have thought that this should be working out-of-the box, > >> i.e. without the pickle.loads trick? > > > > Pickle files should be considered incompatible between Python 2 and > Python 3. > > > > Python 3 interprets all bytes objects saved by Python 2 as str and > attempts > > to decode them under some unicode locale. The default locale is ASCII, > so it > > will simply just fail in most cases if the files contain any binary data. > > > > Failing by default is also the right thing to do, since the saved bytes > > objects might actually represent strings in some locale, and ASCII is the > > safest guess. > > > > This behavior is that of Python's pickle module, and does not depend on > Numpy. > > Thanks a lot for the explanation! > > So what is then the recommended way to save data under python 2 so that > they can still be loaded under python 3? > > For example using np.save with a list of arrays works fine > either on python 2 or on python 3. > However it does not work if one tries to open under python 3 > a file generated before on python 2. > (Again, because pickle is involved internally: > File "python3.4/site-packages/numpy/lib/npyio.py", > line 393, in load return format.read_array(fid) > File "python34/lib/python3.4/site-packages/numpy/lib/format.py", > line 602, in read_array array = pickle.load(fp) > UnicodeDecodeError: 'ascii' codec can't decode byte 0xf0 ... > > Just to be clear: I don't want to beat a dead horse here - for my usage > via pytables I was able to solve the loading of old files following > Ryan's solutions. Personally I don't use .npy files. > Maybe saving a list containing arrays is an unusual example ...
> > Still, I am a little bit worried about backwards-compatibility: > > being able to load old data files is an important issue > > as by this it is possible to check whether current code still > > reproduces previously obtained (maybe also published) results. > > > > Best, Arnd > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL:

From sebix at sebix.at Fri Mar 6 12:34:27 2015 From: sebix at sebix.at (Sebastian) Date: Fri, 06 Mar 2015 18:34:27 +0100 Subject: [Numpy-discussion] numpy pickling problem - python 2 vs. python 3 In-Reply-To: References: Message-ID: <54F9E523.2040104@sebix.at> Hi all, As this also affects .npy files, which use pickle internally, why can't this be done by Numpy itself? This breaks backwards compatibility in a very bad way in my opinion. The company I worked for uses Numpy and co. a lot and also has a lot of data in .npy and pickle files. They currently work with 2.7, but I also tried to develop my programs to be compatible with Py 3. But this was not possible when it came to the point of dumping and loading npy files. I think this will be a major reason why people won't take the step forward to Py3 and Numpy is not considered to be compatible with Python 3. just my 5 cents, Sebastian

On 03/06/2015 04:37 PM, Ryan Nelson wrote: > Arnd, > > I can see where this is an issue. If you are trying to update your code for Py3, I still think that it would really help to add a version attribute of some sort to your new HDF files. You can then write a little check in your access code that looks for this variable. If it is not present, you know that it is an old file, and you can use the trick that I gave you. Otherwise, it will process the file as normal. It could even throw a little error saying that the file is outdated. You could write a small conversion script that could run through old files and reprocess them into the new format. Fortunately, Python is pretty good at automating tasks, even for hundreds of files :) > It might be informative to ask at the PyTables list to see what they've done. The Pandas folks also do a lot with HDF files, and they have certainly worked their way through the Py2-3 transition. Also, because this is an issue with Python pickle, a quick note on SO might get some hits. I tried your script using a list of lists, rather than a list of arrays, and the same problem still persists. So, as Pauli notes, this is going to be a problem regardless of the type of attributes you set; I think you're just going to have to hard-code some kind of check in your code to switch behavior. I recently switched to using Py3 exclusively, and although it was painful at first, I'm quite happy with Py3 overall. I also use the Anaconda Python distribution, which makes it very easy to have Py2 and Py3 environments if you need to switch back and forth. > Sorry if that doesn't help much. Just some thoughts from my recent conversion experiences. > > Ryan > > > > On Fri, Mar 6, 2015 at 9:48 AM, Arnd Baecker > wrote: > > On Fri, 6 Mar 2015, Pauli Virtanen wrote: > > > Arnd Baecker <arnd.baecker at web.de> writes: > > [clip] > >> Still I would have thought that this should be working out-of-the box, > >> i.e. without the pickle.loads trick? > > > > Pickle files should be considered incompatible between Python 2 and Python 3.
> > > > Python 3 interprets all bytes objects saved by Python 2 as str and attempts > > to decode them under some unicode locale. The default locale is ASCII, so it > > will simply just fail in most cases if the files contain any binary data. > > > > Failing by default is also the right thing to do, since the saved bytes > > objects might actually represent strings in some locale, and ASCII is the > > safest guess. > > > > This behavior is that of Python's pickle module, and does not depend on Numpy. > > Thanks a lot for the explanation! > > So what is then the recommended way to save data under python 2 so that > they can still be loaded under python 3? > > For example using np.save with a list of arrays works fine > either on python 2 or on python 3. > However it does not work if one tries to open under python 3 > a file generated before on python 2. > (Again, because pickle is involved internally: > File "python3.4/site-packages/numpy/lib/npyio.py", > line 393, in load return format.read_array(fid) > File "python34/lib/python3.4/site-packages/numpy/lib/format.py", > line 602, in read_array array = pickle.load(fp) > UnicodeDecodeError: 'ascii' codec can't decode byte 0xf0 ... > > Just to be clear: I don't want to beat a dead horse here - for my usage > via pytables I was able to solve the loading of old files following > Ryan's solutions. Personally I don't use .npy files. > Maybe saving a list containing arrays is an unusual example ... > > Still, I am a little bit worried about backwards-compatibility: > being able to load old data files is an important issue > as by this it is possible to check whether current code still > reproduces previously obtained (maybe also published) results. > > Best, Arnd > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- > python programming - mail server - photo - video - https://sebix.at > To verify my cryptographic signature or send me encrypted mails, get my > key at https://sebix.at/DC9B463B.asc and on public keyservers.

From charlesr.harris at gmail.com Fri Mar 6 12:51:52 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 6 Mar 2015 10:51:52 -0700 Subject: [Numpy-discussion] numpy pickling problem - python 2 vs.
python 3 In-Reply-To: <54F9E523.2040104@sebix.at> References: <54F9E523.2040104@sebix.at> Message-ID: On Fri, Mar 6, 2015 at 10:34 AM, Sebastian wrote: > > Hi all, > > As this also affects .npy files, which use pickle internally, why can't > this be done by Numpy itself? This breaks backwards compatibility in a > very bad way in my opinion. > > The company I worked for uses Numpy and co. a lot and also has a lot of > data in .npy and pickle files. They currently work with 2.7, but I also > tried to develop my programs to be compatible with Py 3. But this was > not possible when it came to the point of dumping and loading npy files. > I think this will be a major reason why people won't take the step forward > to Py3 and Numpy is not considered to be compatible with Python 3. > Are you suggesting adding a flag to the files to mark the python version in which they were created? The *.npy format is versioned, so something could probably be done with that. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL:

From ben.root at ou.edu Fri Mar 6 13:00:05 2015 From: ben.root at ou.edu (Benjamin Root) Date: Fri, 6 Mar 2015 13:00:05 -0500 Subject: [Numpy-discussion] numpy pickling problem - python 2 vs. python 3 In-Reply-To: References: <54F9E523.2040104@sebix.at> Message-ID: A slightly different way to look at this is one of sharing data. If I am working on a system with 3.4 and I want to share data with others who may be using a mix of 2.7 and 3.3 systems, this problem makes npz format much less attractive. Ben Root

On Fri, Mar 6, 2015 at 12:51 PM, Charles R Harris wrote: > > > On Fri, Mar 6, 2015 at 10:34 AM, Sebastian wrote: > >> >> Hi all, >> >> As this also affects .npy files, which use pickle internally, why can't >> this be done by Numpy itself? This breaks backwards compatibility in a >> very bad way in my opinion. >> >> The company I worked for uses Numpy and co. a lot and also has a lot of >> data in .npy and pickle files. They currently work with 2.7, but I also >> tried to develop my programs to be compatible with Py 3. But this was >> not possible when it came to the point of dumping and loading npy files. >> I think this will be a major reason why people won't take the step forward >> to Py3 and Numpy is not considered to be compatible with Python 3. >> > > Are you suggesting adding a flag to the files to mark the python version > in which they were created? The *.npy format is versioned, so something > could probably be done with that. > > Chuck > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL:

From chris.barker at noaa.gov Fri Mar 6 15:09:33 2015 From: chris.barker at noaa.gov (Chris Barker) Date: Fri, 6 Mar 2015 12:09:33 -0800 Subject: [Numpy-discussion] appveyor CI In-Reply-To: References: <87wq2vs02q.fsf@berkeley.edu> Message-ID: On Thu, Mar 5, 2015 at 5:07 PM, Charles R Harris wrote: > Do line endings in the scripts matter? > I have no idea if powershell cares about line endings, but if you are using git, then you'll want to make sure that your repo is properly configured to normalize line endings -- then there should be no problems. And you really want that anyway for any multi-platform project. -CHB -- Christopher Barker, Ph.D.
Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL:

From pav at iki.fi Fri Mar 6 15:23:46 2015 From: pav at iki.fi (Pauli Virtanen) Date: Fri, 06 Mar 2015 22:23:46 +0200 Subject: [Numpy-discussion] numpy pickling problem - python 2 vs. python 3 In-Reply-To: References: <54F9E523.2040104@sebix.at> Message-ID: 06.03.2015, 20:00, Benjamin Root wrote: > A slightly different way to look at this is one of sharing data. If I am > working on a system with 3.4 and I want to share data with others who may > be using a mix of 2.7 and 3.3 systems, this problem makes npz format much > less attractive. pickle is used in npy files only if there are object arrays in them. Of course, savez could just decline saving object arrays.

From efiring at hawaii.edu Fri Mar 6 15:43:38 2015 From: efiring at hawaii.edu (Eric Firing) Date: Fri, 06 Mar 2015 10:43:38 -1000 Subject: [Numpy-discussion] numpy pickling problem - python 2 vs. python 3 In-Reply-To: References: <54F9E523.2040104@sebix.at> Message-ID: <54FA117A.5000904@hawaii.edu> On 2015/03/06 10:23 AM, Pauli Virtanen wrote: > 06.03.2015, 20:00, Benjamin Root wrote: >> A slightly different way to look at this is one of sharing data. If I am >> working on a system with 3.4 and I want to share data with others who may >> be using a mix of 2.7 and 3.3 systems, this problem makes npz format much >> less attractive. > > pickle is used in npy files only if there are object arrays in them. > Of course, savez could just decline saving object arrays. Or issue a prominent warning. Eric

From pav at iki.fi Fri Mar 6 18:20:09 2015 From: pav at iki.fi (Pauli Virtanen) Date: Sat, 07 Mar 2015 01:20:09 +0200 Subject: [Numpy-discussion] numpy pickling problem - python 2 vs. python 3 In-Reply-To: <54FA117A.5000904@hawaii.edu> References: <54F9E523.2040104@sebix.at> <54FA117A.5000904@hawaii.edu> Message-ID: 06.03.2015, 22:43, Eric Firing wrote: > On 2015/03/06 10:23 AM, Pauli Virtanen wrote: >> 06.03.2015, 20:00, Benjamin Root wrote: >>> A slightly different way to look at this is one of sharing data. If I am >>> working on a system with 3.4 and I want to share data with others who may >>> be using a mix of 2.7 and 3.3 systems, this problem makes npz format much >>> less attractive. >> >> pickle is used in npy files only if there are object arrays in them. >> Of course, savez could just decline saving object arrays. > > Or issue a prominent warning. https://github.com/numpy/numpy/pull/5641

From pav at iki.fi Fri Mar 6 18:21:09 2015 From: pav at iki.fi (Pauli Virtanen) Date: Sat, 07 Mar 2015 01:21:09 +0200 Subject: [Numpy-discussion] numpy pickling problem - python 2 vs. python 3 In-Reply-To: References: <54F9E523.2040104@sebix.at> Message-ID: 06.03.2015, 22:23, Pauli Virtanen wrote: > 06.03.2015, 20:00, Benjamin Root wrote: >> A slightly different way to look at this is one of sharing data. If I am >> working on a system with 3.4 and I want to share data with others who may >> be using a mix of 2.7 and 3.3 systems, this problem makes npz format much >> less attractive. > > pickle is used in npy files only if there are object arrays in them. > Of course, savez could just decline saving object arrays.
np.load is missing the Py2-3 workaround flags that pickle.load has; they probably could be added: https://github.com/numpy/numpy/pull/5640

From jtaylor.debian at googlemail.com Fri Mar 6 18:29:04 2015 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Sat, 07 Mar 2015 00:29:04 +0100 Subject: [Numpy-discussion] numpy pickling problem - python 2 vs. python 3 In-Reply-To: References: <54F9E523.2040104@sebix.at> <54FA117A.5000904@hawaii.edu> Message-ID: <54FA3840.3020602@googlemail.com> On 07.03.2015 00:20, Pauli Virtanen wrote: > 06.03.2015, 22:43, Eric Firing wrote: >> On 2015/03/06 10:23 AM, Pauli Virtanen wrote: >>> 06.03.2015, 20:00, Benjamin Root wrote: >>>> A slightly different way to look at this is one of sharing data. If I am >>>> working on a system with 3.4 and I want to share data with others who may >>>> be using a mix of 2.7 and 3.3 systems, this problem makes npz format much >>>> less attractive. >>> >>> pickle is used in npy files only if there are object arrays in them. >>> Of course, savez could just decline saving object arrays. >> >> Or issue a prominent warning. > > https://github.com/numpy/numpy/pull/5641 > I think the ship for a warning has long sailed. At this point it's probably more an annoyance for python3 users and will not prevent many more python2 users from saving files that can't be loaded into python3.

From charlesr.harris at gmail.com Fri Mar 6 18:40:27 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 6 Mar 2015 16:40:27 -0700 Subject: [Numpy-discussion] Numpy 1.10 Message-ID: Hi All, Time to start thinking about numpy 1.10. At the moment there are 21 blockers and 93 PRs. It would be good if we could prioritize the blockers and the PRs. It might be a good idea to also look at closing some of the aging PRs. These days it seems to be normal to have 20 or so PRs in progress; it would be nice if we could get the total down around 40-50. I don't see a numpy 1.10 before July, but it is going to take a few months to get things in order, so now is a good time to start. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL:

From efiring at hawaii.edu Fri Mar 6 20:06:20 2015 From: efiring at hawaii.edu (Eric Firing) Date: Fri, 06 Mar 2015 15:06:20 -1000 Subject: [Numpy-discussion] numpy pickling problem - python 2 vs. python 3 In-Reply-To: <54FA3840.3020602@googlemail.com> References: <54F9E523.2040104@sebix.at> <54FA117A.5000904@hawaii.edu> <54FA3840.3020602@googlemail.com> Message-ID: <54FA4F0C.6090406@hawaii.edu> On 2015/03/06 1:29 PM, Julian Taylor wrote: > I think the ship for a warning has long sailed. At this point it's > probably more an annoyance for python3 users and will not prevent many > more python2 users from saving files that can't be loaded into python3. The point of a warning is that anything that relies on pickles is fundamentally unreliable in the long term. It's potentially a surprise that the npz format relies on pickles.

From pav at iki.fi Sat Mar 7 04:54:06 2015 From: pav at iki.fi (Pauli Virtanen) Date: Sat, 07 Mar 2015 11:54:06 +0200 Subject: [Numpy-discussion] numpy pickling problem - python 2 vs.
python 3 In-Reply-To: <54FA3840.3020602@googlemail.com> References: <54F9E523.2040104@sebix.at> <54FA117A.5000904@hawaii.edu> <54FA3840.3020602@googlemail.com> Message-ID: 07.03.2015, 01:29, Julian Taylor wrote: > On 07.03.2015 00:20, Pauli Virtanen wrote: >> 06.03.2015, 22:43, Eric Firing wrote: >>> On 2015/03/06 10:23 AM, Pauli Virtanen wrote: >>>> 06.03.2015, 20:00, Benjamin Root wrote: >>>>> A slightly different way to look at this is one of sharing data. If I am >>>>> working on a system with 3.4 and I want to share data with others who may >>>>> be using a mix of 2.7 and 3.3 systems, this problem makes npz format much >>>>> less attractive. >>>> >>>> pickle is used in npy files only if there are object arrays in them. >>>> Of course, savez could just decline saving object arrays. >>> >>> Or issue a prominent warning. >> >> https://github.com/numpy/numpy/pull/5641 >> > > I think the ship for a warning has long sailed. At this point it's > probably more an annoyance for python3 users and will not prevent many > more python2 users from saving files that can't be loaded into python3. How about an extra use_pickle=True kwarg that can be used to disable using pickle altogether in these routines? Another reason to do this is arbitrary code execution when loading pickles: https://www.cs.jhu.edu/~s/musings/pickle.html Easily demonstrated also with npy files (loading this file will only print something unexpected, nothing more malicious): http://pav.iki.fi/tmp/unexpected.npy

From robert.kern at gmail.com Sat Mar 7 05:23:31 2015 From: robert.kern at gmail.com (Robert Kern) Date: Sat, 7 Mar 2015 10:23:31 +0000 Subject: [Numpy-discussion] numpy pickling problem - python 2 vs. python 3 In-Reply-To: References: <54F9E523.2040104@sebix.at> <54FA117A.5000904@hawaii.edu> <54FA3840.3020602@googlemail.com> Message-ID: On Sat, Mar 7, 2015 at 9:54 AM, Pauli Virtanen wrote: > How about an extra use_pickle=True kwarg that can be used to disable > using pickle altogether in these routines? If we do, I'd vastly prefer `forbid_pickle=False`. The use_pickle spelling suggests that you are asking it to use pickle when it otherwise wouldn't, which is not the intention. -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... URL:

From sebastian at sipsolutions.net Sat Mar 7 06:26:07 2015 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Sat, 07 Mar 2015 12:26:07 +0100 Subject: [Numpy-discussion] numpy pickling problem - python 2 vs. python 3 In-Reply-To: References: <54F9E523.2040104@sebix.at> <54FA117A.5000904@hawaii.edu> <54FA3840.3020602@googlemail.com> Message-ID: <1425727567.17341.6.camel@sipsolutions.net> On Sa, 2015-03-07 at 10:23 +0000, Robert Kern wrote: > On Sat, Mar 7, 2015 at 9:54 AM, Pauli Virtanen wrote: > > > How about an extra use_pickle=True kwarg that can be used to disable > > using pickle altogether in these routines? > > If we do, I'd vastly prefer `forbid_pickle=False`. The use_pickle > spelling suggests that you are asking it to use pickle when it > otherwise wouldn't, which is not the intention. > I like the idea, at least for loading. Could also call it `allow_objects` with an explanation in the documentation. I would consider deprecating it and not allowing pickles as default, but I am not sure that is not going too far. However, I think we should be able to safely share data using npy.
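For illustration, caller-side usage under such a flag might look like the following sketch; the keyword name and exact behavior are hypothetical until a pull request settles them:

import numpy as np

# hypothetical keyword: refuse pickle-backed payloads when loading, so a
# file containing object arrays raises instead of running pickle code
data = np.load('shared.npy', allow_pickle=False)

Files holding only plain numerical arrays would load unchanged, which is exactly the safe-sharing case above.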
- Sebastian > -- > Robert Kern > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL:

From dineshbvadhia at hotmail.com Sat Mar 7 16:02:03 2015 From: dineshbvadhia at hotmail.com (Dinesh Vadhia) Date: Sat, 7 Mar 2015 13:02:03 -0800 Subject: [Numpy-discussion] numpy array casting ruled not safe Message-ID: This was originally posted on SO (https://stackoverflow.com/questions/28853740/numpy-array-casting-ruled-not-safe) and it was suggested it is probably a bug in numpy.take.

Python 2.7.8 |Anaconda 2.1.0 (32-bit)| (default, Jul 2 2014, 15:13:35) [MSC v.1500 32 bit (Intel)] on win32
Type "copyright", "credits" or "license()" for more information.
>>> import numpy
>>> numpy.__version__
'1.9.2'
>>> a = numpy.array([9, 7, 5, 4, 3, 1], dtype=numpy.uint32)
>>> b = numpy.array([1, 3], dtype=numpy.uint32)
>>> c = a.take(b)
Traceback (most recent call last):
  File "<pyshell#...>", line 1, in <module>
    c = a.take(b)
TypeError: Cannot cast array data from dtype('uint32') to dtype('int32') according to the rule 'safe'

-------------- next part -------------- An HTML attachment was scrubbed... URL:

From charlesr.harris at gmail.com Sat Mar 7 16:45:50 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 7 Mar 2015 14:45:50 -0700 Subject: [Numpy-discussion] numpy array casting ruled not safe In-Reply-To: References: Message-ID: On Sat, Mar 7, 2015 at 2:02 PM, Dinesh Vadhia wrote: > This was originally posted on SO ( > https://stackoverflow.com/questions/28853740/numpy-array-casting-ruled-not-safe) > and it was suggested it is probably a bug in numpy.take. > > Python 2.7.8 |Anaconda 2.1.0 (32-bit)| (default, Jul 2 2014, 15:13:35) > [MSC v.1500 32 bit (Intel)] on win32 > Type "copyright", "credits" or "license()" for more information. > > >>> import numpy > >>> numpy.__version__ > '1.9.2' > > >>> a = numpy.array([9, 7, 5, 4, 3, 1], dtype=numpy.uint32) > >>> b = numpy.array([1, 3], dtype=numpy.uint32) > >>> c = a.take(b) > > Traceback (most recent call last): > File "<pyshell#...>", line 1, in <module> > c = a.take(b) > TypeError: Cannot cast array data from dtype('uint32') to dtype('int32') > according to the rule 'safe' > This actually looks correct for 32-bit Windows. Numpy indexes with a signed type big enough to hold a pointer to void, which in this case is an int32, and the uint32 cannot be safely cast to that type. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL:

From charlesr.harris at gmail.com Sat Mar 7 16:52:16 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 7 Mar 2015 14:52:16 -0700 Subject: [Numpy-discussion] numpy array casting ruled not safe In-Reply-To: References: Message-ID: On Sat, Mar 7, 2015 at 2:45 PM, Charles R Harris wrote: > > > On Sat, Mar 7, 2015 at 2:02 PM, Dinesh Vadhia > wrote: > >> This was originally posted on SO ( >> https://stackoverflow.com/questions/28853740/numpy-array-casting-ruled-not-safe) >> and it was suggested it is probably a bug in numpy.take. >> >> Python 2.7.8 |Anaconda 2.1.0 (32-bit)| (default, Jul 2 2014, 15:13:35) >> [MSC v.1500 32 bit (Intel)] on win32 >> Type "copyright", "credits" or "license()" for more information.
>> >> >>> import numpy >> >>> numpy.__version__ >> '1.9.2' >> >> >>> a = numpy.array([9, 7, 5, 4, 3, 1], dtype=numpy.uint32) >> >>> b = numpy.array([1, 3], dtype=numpy.uint32) >> >>> c = a.take(b) >> >> Traceback (most recent call last): >> File "<pyshell#...>", line 1, in <module> >> c = a.take(b) >> TypeError: Cannot cast array data from dtype('uint32') to dtype('int32') >> according to the rule 'safe' >> > > This actually looks correct for 32-bit Windows. Numpy indexes with a > signed type big enough to hold a pointer to void, which in this case is an > int32, and the uint32 cannot be safely cast to that type. > > Chuck > I note that on SO Jaime made the suggestion that take use unsafe casting and throw an error on out of bounds indexes. That sounds reasonable, although for sufficiently large integer types an index could wrap around to a good value. Maybe make it work only for npy_uintp. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL:

From jaime.frio at gmail.com Sat Mar 7 21:21:21 2015 From: jaime.frio at gmail.com (=?UTF-8?Q?Jaime_Fern=C3=A1ndez_del_R=C3=ADo?=) Date: Sat, 7 Mar 2015 18:21:21 -0800 Subject: [Numpy-discussion] numpy array casting ruled not safe In-Reply-To: References: Message-ID: On Sat, Mar 7, 2015 at 1:52 PM, Charles R Harris wrote: > > > On Sat, Mar 7, 2015 at 2:45 PM, Charles R Harris < > charlesr.harris at gmail.com> wrote: > >> >> >> On Sat, Mar 7, 2015 at 2:02 PM, Dinesh Vadhia >> wrote: > >>> This was originally posted on SO ( >>> https://stackoverflow.com/questions/28853740/numpy-array-casting-ruled-not-safe) >>> and it was suggested it is probably a bug in numpy.take. >>> >>> Python 2.7.8 |Anaconda 2.1.0 (32-bit)| (default, Jul 2 2014, 15:13:35) >>> [MSC v.1500 32 bit (Intel)] on win32 >>> Type "copyright", "credits" or "license()" for more information. >>> >>> >>> import numpy >>> >>> numpy.__version__ >>> '1.9.2' >>> >>> >>> a = numpy.array([9, 7, 5, 4, 3, 1], dtype=numpy.uint32) >>> >>> b = numpy.array([1, 3], dtype=numpy.uint32) >>> >>> c = a.take(b) >>> >>> Traceback (most recent call last): >>> File "<pyshell#...>", line 1, in <module> >>> c = a.take(b) >>> TypeError: Cannot cast array data from dtype('uint32') to dtype('int32') >>> according to the rule 'safe' >>> >> >> This actually looks correct for 32-bit Windows. Numpy indexes with a >> signed type big enough to hold a pointer to void, which in this case is an >> int32, and the uint32 cannot be safely cast to that type. >> >> Chuck >> > > I note that on SO Jaime made the suggestion that take use unsafe casting > and throw an error on out of bounds indexes. That sounds reasonable, > although for sufficiently large integer types an index could wrap around to > a good value. Maybe make it work only for npy_uintp. > > Chuck > It is mostly about consistency, and having take match what indexing already does, which is to unsafely cast all integers: In [11]: np.arange(10)[np.uint64(2**64-1)] Out[11]: 9 I think no one has ever complained about that obviously wrong behavior, but people do get annoyed if they cannot use their perfectly valid uint64 array because we want to protect them from themselves. Sebastian has probably given this more thought than anyone else, it would be interesting to hear his thoughts on this. Jaime -- (\__/) ( O.o) ( > <) Este es Conejo. Copia a Conejo en tu firma y ayúdale en sus planes de dominación mundial. -------------- next part -------------- An HTML attachment was scrubbed...
URL: From sebastian at sipsolutions.net Sun Mar 8 06:49:20 2015 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Sun, 08 Mar 2015 11:49:20 +0100 Subject: [Numpy-discussion] numpy array casting ruled not safe In-Reply-To: References: Message-ID: <1425811760.21916.16.camel@sipsolutions.net> On Sa, 2015-03-07 at 18:21 -0800, Jaime Fernández del Río wrote: > > > > > I note that on SO Jaime made the suggestion that take use > unsafe casting and throw an error on out of bounds indexes. > That sounds reasonable, although for sufficiently large > integer types an index could wrap around to a good value. > Maybe make it work only for npy_uintp. > > > Chuck > > > It is mostly about consistency, and having take match what indexing > already does, which is to unsafely cast all integers: > > > In [11]: np.arange(10)[np.uint64(2**64-1)] > Out[11]: 9 > > > I think no one has ever complained about that obviously wrong > behavior, but people do get annoyed if they cannot use their perfectly > valid uint64 array because we want to protect them from themselves. > Sebastian has probably given this more thought than anyone else, it > would be interesting to hear his thoughts on this. > Not really, there was no change in behaviour for arrays here. Apparently though (which I did not realize), there was a change for numpy scalars/0-d arrays. Of course I think ideally "same_type" casting would raise an error or at least warn on out of bounds integers, but we do not have a mechanism for that. We could fix this, I think Jaime you had thought about that at some point? But it would require loop specializations for every integer type. So, I am not sure what to prefer, but for the user indexing with unsigned integers has to keep working without explicit cast. Of course the fact that it is dangerous is bothering me a bit, even if a dangerous wrap-around seems unlikely in practice. - Sebastian > > Jaime > > > -- > (\__/) > ( O.o) > ( > <) Este es Conejo. Copia a Conejo en tu firma y ayúdale en sus > planes de dominación mundial. > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL:

From jtaylor.debian at googlemail.com Sun Mar 8 08:35:21 2015 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Sun, 08 Mar 2015 13:35:21 +0100 Subject: [Numpy-discussion] numpy array casting ruled not safe In-Reply-To: <1425811760.21916.16.camel@sipsolutions.net> References: <1425811760.21916.16.camel@sipsolutions.net> Message-ID: <54FC4209.1080607@googlemail.com> On 08.03.2015 11:49, Sebastian Berg wrote: > On Sa, 2015-03-07 at 18:21 -0800, Jaime Fernández del Río wrote: > >> >> >> >> >> I note that on SO Jaime made the suggestion that take use >> unsafe casting and throw an error on out of bounds indexes. >> That sounds reasonable, although for sufficiently large >> integer types an index could wrap around to a good value. >> Maybe make it work only for npy_uintp.
>> >> >> Chuck >> >> >> It is mostly about consistency, and having take match what indexing >> already does, which is to unsafely cast all integers: >> >> >> In [11]: np.arange(10)[np.uint64(2**64-1)] >> Out[11]: 9 >> >> >> I think no one has ever complained about that obviously wrong >> behavior, but people do get annoyed if they cannot use their perfectly >> valid uint64 array because we want to protect them from themselves. >> Sebastian has probably given this more thought than anyone else, it >> would be interesting to hear his thoughts on this. >> > > Not really, there was no change in behaviour for arrays here. Apparently > though (which I did not realize), there was a change for numpy > scalars/0-d arrays. Of course I think ideally "same_type" casting would > raise an error or at least warn on out of bounds integers, but we do not > have a mechanism for that. > > We could fix this, I think Jaime you had thought about that at some > point? But it would require loop specializations for every integer type. > > So, I am not sure what to prefer, but for the user indexing with > unsigned integers has to keep working without explicit cast. Of course > the fact that it is dangerous is bothering me a bit, even if a > dangerous wrap-around seems unlikely in practice. > I was working on supporting arbitrary integer types as index without casting. This would have a few advantages: you can save memory without sacrificing indexing performance by using smaller integers, and you can skip the negative index wraparound step for unsigned types. But it does add quite a bit of code bloat that is essentially a micro-optimization. To make it really useful one also needs to adapt other functions like where, arange, meshgrid, indices etc. to have an option to return the smallest integer type that is sufficient for the index array.

From sdpan21 at gmail.com Sun Mar 8 16:47:43 2015 From: sdpan21 at gmail.com (Dp Docs) Date: Mon, 9 Mar 2015 02:17:43 +0530 Subject: [Numpy-discussion] Introductory mail and GSoc Project "Vector math library integration" Message-ID: Hi all, I am a third-year CS undergraduate student at an Indian institute (IIIT). I believe I am good with programming languages like C/C++ and Python, as I have already done some projects in these languages as part of my academics. I really like coding (competitive as well as development). I really want to get involved in NumPy development and would like to take "Vector math library integration" as my project. I would like to hear any ideas from your side for this project. Thanks for your time reading this email and responding. My IRC nickname: dp Real Name: Durgesh Pandey.

From ralf.gommers at gmail.com Sun Mar 8 17:43:51 2015 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sun, 8 Mar 2015 22:43:51 +0100 Subject: [Numpy-discussion] Numpy 1.10 In-Reply-To: References: Message-ID: On Sat, Mar 7, 2015 at 12:40 AM, Charles R Harris wrote: > Hi All, > > Time to start thinking about numpy 1.10. > Sounds good. Do we have a volunteer for release manager already? > At the moment there are 21 blockers and 93 PRs. It would be good if we > could prioritize the blockers and the PRs. > Many of those don't look like real blockers. I started labeling and renaming PRs so one can actually see quickly what PRs are about. > It might be a good idea to also look at closing some of the aging PRs. > +1 (but let's start gently) > These days it seems to be normal to have 20 or so PRs in progress; it > would be nice if we could get the total down around 40-50.
> > I don't see a numpy 1.10 before July, but it is going to take a few months > to get things in order, so now is a good time to start. > Sooner would be possible probably. Releasing a bit more frequently would be useful. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL:

From massimo.dipierro at gmail.com Mon Mar 9 14:30:28 2015 From: massimo.dipierro at gmail.com (Massimo DiPierro) Date: Mon, 9 Mar 2015 13:30:28 -0500 Subject: [Numpy-discussion] DePy 2015 Message-ID: Hello everybody, We are organizing a new conference on Python in Chicago (May 29-30) with focus on Numerical Applications, Machine Learning and Web: http://mdp.cdm.depaul.edu/DePy2015/default/index We are looking for participants, speakers, and sponsors. Please register and submit a talk/tutorial proposal. Massimo -------------- next part -------------- An HTML attachment was scrubbed... URL:

From pmhobson at gmail.com Mon Mar 9 16:34:38 2015 From: pmhobson at gmail.com (Paul Hobson) Date: Mon, 9 Mar 2015 13:34:38 -0700 Subject: [Numpy-discussion] Would like to patch docstring for numpy.random.normal In-Reply-To: References: <1557076892447072653.462812sturla.molden-gmail.com@news.gmane.org> Message-ID: I feel your pain. Making it worse, numpy.random.lognormal takes "mean" and "sigma" as input. If there's ever a backwards incompatible release, I hope these things will be cleared up. On Tue, Mar 3, 2015 at 9:59 PM, Daniel Sank wrote: > Sturla, > > > Change the name of keyword arguments? > > This would be evil. > > Yes, I see this now. It is really a shame these were defined as keyword > arguments. I've shown the docstring to a few people and so far they all > agree that it is unnecessarily convoluted. > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL:

From ralf.gommers at gmail.com Mon Mar 9 18:16:38 2015 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Mon, 9 Mar 2015 23:16:38 +0100 Subject: [Numpy-discussion] Numpy 1.10 In-Reply-To: References: Message-ID: On Sun, Mar 8, 2015 at 10:43 PM, Ralf Gommers wrote: > > > On Sat, Mar 7, 2015 at 12:40 AM, Charles R Harris < > charlesr.harris at gmail.com> wrote: > >> Hi All, >> >> Time to start thinking about numpy 1.10. >> > > Sounds good. Do we have a volunteer for release manager already? > > >> At the moment there are 21 blockers and 93 PRs. It would be good if we >> could prioritize the blockers and the PRs. >> > > Many of those don't look like real blockers. I started labeling and > renaming PRs so one can actually see quickly what PRs are about. > The labeling/renaming is finished. Made things more colorful at least :) I'll have a go at the open numpy.ma PRs. I did learn from that exercise that we really would benefit from a dedicated maintainer for numpy.ma. If anyone would be interested in taking on that role, please speak up! Ralf -------------- next part -------------- An HTML attachment was scrubbed...
URL: From charlesr.harris at gmail.com Mon Mar 9 19:31:06 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 9 Mar 2015 17:31:06 -0600 Subject: [Numpy-discussion] Numpy 1.10 In-Reply-To: References: Message-ID: On Mon, Mar 9, 2015 at 4:16 PM, Ralf Gommers wrote: > > > On Sun, Mar 8, 2015 at 10:43 PM, Ralf Gommers > wrote: > >> >> >> On Sat, Mar 7, 2015 at 12:40 AM, Charles R Harris < >> charlesr.harris at gmail.com> wrote: >> >>> Hi All, >>> >>> Time to start thinking about numpy 1.10. >>> >> >> Sounds good. Do we have a volunteer for release manager already? >> >> >>> At the moment there are 21 blockers and 93 PRs. It would be good if we >>> could prioritize the blockers and the PRs. >>> >> >> Many of those don't look like real blockers. I started labeling and >> renaming PRs so one can actually see quickly what PRs are about. >> > > The labeling/renaming is finished. Made things more colorful at least :) > I'll have a go at the open numpy.ma PRs. > Thanks Ralf. I need to add `fixing alignment` to the blockers. 1.9.2 is fixed, but IIRC, that fix is not in devel. > > I did learn from that exercise that we really would benefit from a > dedicated maintainer for numpy.ma. If anyone would be interested in > taking on that role, please speak up! > Yes, that would be very helpful. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL:

From charlesr.harris at gmail.com Mon Mar 9 19:33:12 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 9 Mar 2015 17:33:12 -0600 Subject: [Numpy-discussion] Would like to patch docstring for numpy.random.normal In-Reply-To: References: <1557076892447072653.462812sturla.molden-gmail.com@news.gmane.org> Message-ID: On Mon, Mar 9, 2015 at 2:34 PM, Paul Hobson wrote: > I feel your pain. Making it worse, numpy.random.lognormal takes "mean" and > "sigma" as input. If there's ever a backwards incompatible release, I hope > these things will be cleared up. > There is a numpy 2.0 milestone ;) Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL:

From josef.pktd at gmail.com Tue Mar 10 12:33:10 2015 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 10 Mar 2015 12:33:10 -0400 Subject: [Numpy-discussion] MKL ValueError: On entry to DLASCL parameter number 5 had an illegal value Message-ID: I got an illegal value message using MKL on Windows 64 while running the statsmodels test suite. Kevin is getting the same with more information, which indicates that it might be numpy.linalg.svd https://github.com/statsmodels/statsmodels/issues/2308#issuecomment-78086656 Is this serious? I'm just setting up a new computer and haven't investigated yet. Given the name of the test class, this might be a warning for an inf or nan, in which case it would be our problem in statsmodels Josef -------------- next part -------------- An HTML attachment was scrubbed... URL:

From charlesr.harris at gmail.com Tue Mar 10 13:20:50 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 10 Mar 2015 11:20:50 -0600 Subject: [Numpy-discussion] MKL ValueError: On entry to DLASCL parameter number 5 had an illegal value In-Reply-To: References: Message-ID: On Tue, Mar 10, 2015 at 10:33 AM, wrote: > > I got an illegal value message using MKL on Windows 64 while running the > statsmodels test suite.
> > Kevin is getting the same with more information, which indicates that it > might be numpy.linalg.svd > > https://github.com/statsmodels/statsmodels/issues/2308#issuecomment-78086656 > > Is this serious? > > I'm just setting up a new computer and haven't investigated yet. > Given the name of the test class, this might be a warning for an inf or > nan, in which case it would be our problem in statsmodels > > What version of Numpy? Does this also happen if you aren't using MKL? The dlascl reference is here. It is possible that the problem being solved is numerically sensitive and that the treatment of underflow/overflow due to compiler flags might be producing zeros or infs. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL:

From josef.pktd at gmail.com Tue Mar 10 13:50:35 2015 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 10 Mar 2015 13:50:35 -0400 Subject: [Numpy-discussion] MKL ValueError: On entry to DLASCL parameter number 5 had an illegal value In-Reply-To: References: Message-ID: On Tue, Mar 10, 2015 at 1:20 PM, Charles R Harris wrote: > > > On Tue, Mar 10, 2015 at 10:33 AM, wrote: > >> >> I got an illegal value message using MKL on Windows 64 while running the >> statsmodels test suite. >> >> Kevin is getting the same with more information, which indicates that it >> might be numpy.linalg.svd >> >> https://github.com/statsmodels/statsmodels/issues/2308#issuecomment-78086656 >> >> Is this serious? >> >> I'm just setting up a new computer and haven't investigated yet. >> Given the name of the test class, this might be a warning for an inf or >> nan, in which case it would be our problem in statsmodels >> >> > What version of Numpy? Does this also happen if you aren't using MKL? The > dlascl reference is here. It is possible that the problem being solved is > numerically sensitive and that the treatment of underflow/overflow due to > compiler flags might be producing zeros or infs. > nosetests says NumPy version 1.9.2rc1. The version is from WinPython, which I guess uses Gohlke binaries. It didn't show up with official numpy in 32bit python, and it doesn't show up on TravisCI. In contrast to Kevin, I don't get any info or test failure:

M:\Notes>nosetests --pdb --pdb-failures -v statsmodels.base.tests.test_data.TestMissingArray
statsmodels.base.tests.test_data.TestMissingArray.test_raise_no_missing ... ok
statsmodels.base.tests.test_data.TestMissingArray.test_raise ... ok
statsmodels.base.tests.test_data.TestMissingArray.test_drop ... ok
statsmodels.base.tests.test_data.TestMissingArray.test_none ...
Intel MKL ERROR: Parameter 4 was incorrect on entry to DLASCL.
Intel MKL ERROR: Parameter 5 was incorrect on entry to DLASCL.
Intel MKL ERROR: Parameter 4 was incorrect on entry to DLASCL.
Intel MKL ERROR: Parameter 5 was incorrect on entry to DLASCL.
ok
statsmodels.base.tests.test_data.TestMissingArray.test_endog_only_raise ... ok
statsmodels.base.tests.test_data.TestMissingArray.test_endog_only_drop ... ok
statsmodels.base.tests.test_data.TestMissingArray.test_mv_endog ... ok
statsmodels.base.tests.test_data.TestMissingArray.test_extra_kwargs_2d ... ok
statsmodels.base.tests.test_data.TestMissingArray.test_extra_kwargs_1d ...
ok

----------------------------------------------------------------------
Ran 9 tests in 0.049s

OK

However, based on the test case, it looks like we call np.linalg.svd with an array that contains NaNs; DLASCL might then be used internally in the SVD calculations. Josef > Chuck > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL:

From pmhobson at gmail.com Tue Mar 10 14:15:09 2015 From: pmhobson at gmail.com (Paul Hobson) Date: Tue, 10 Mar 2015 11:15:09 -0700 Subject: [Numpy-discussion] Would like to patch docstring for numpy.random.normal In-Reply-To: References: <1557076892447072653.462812sturla.molden-gmail.com@news.gmane.org> Message-ID: On Mon, Mar 9, 2015 at 4:33 PM, Charles R Harris wrote: > > > On Mon, Mar 9, 2015 at 2:34 PM, Paul Hobson wrote: > >> I feel your pain. Making it worse, numpy.random.lognormal takes "mean" >> and "sigma" as input. If there's ever a backwards incompatible release, I >> hope these things will be cleared up. >> > > There is a numpy 2.0 milestone ;) > > Is it worth submitting PRs against the existing 2.X branch or is that so far away that the can should be kicked down the road? -Paul -------------- next part -------------- An HTML attachment was scrubbed... URL:

From njs at pobox.com Tue Mar 10 14:22:36 2015 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 10 Mar 2015 11:22:36 -0700 Subject: [Numpy-discussion] Would like to patch docstring for numpy.random.normal In-Reply-To: References: <1557076892447072653.462812sturla.molden-gmail.com@news.gmane.org> Message-ID: On Mar 10, 2015 11:15 AM, "Paul Hobson" wrote: > > > On Mon, Mar 9, 2015 at 4:33 PM, Charles R Harris < charlesr.harris at gmail.com> wrote: >> >> >> >> On Mon, Mar 9, 2015 at 2:34 PM, Paul Hobson wrote: >>> >>> I feel your pain. Making it worse, numpy.random.lognormal takes "mean" and "sigma" as input. If there's ever a backwards incompatible release, I hope these things will be cleared up. >> >> >> There is a numpy 2.0 milestone ;) >> > > Is it worth submitting PRs against the existing 2.X branch or is that so far away that the can should be kicked down the road? Not sure what you mean by "the existing 2.X branch" (does such a thing exist somewhere?), but yeah, don't submit PRs like that. Best case they'd bit rot before we ever get around to 2.0, worst case 2.0 may never happen. (What, you liked python 2 -> 3 so much you want to go through that again?) -------------- next part -------------- An HTML attachment was scrubbed... URL:

From sturla.molden at gmail.com Tue Mar 10 14:25:21 2015 From: sturla.molden at gmail.com (Sturla Molden) Date: Tue, 10 Mar 2015 19:25:21 +0100 Subject: [Numpy-discussion] Would like to patch docstring for numpy.random.normal In-Reply-To: References: <1557076892447072653.462812sturla.molden-gmail.com@news.gmane.org> Message-ID: On 09/03/15 21:34, Paul Hobson wrote: > I feel your pain. Making it worse, numpy.random.lognormal takes "mean" > and "sigma" as input. If there's ever a backwards incompatible release, > I hope these things will be cleared up. The question is how... The fix is obvious, but the consequences are unacceptable.
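The usual escape hatch, sketched below purely as an illustration and not as an actual NumPy plan, is a shim that accepts both spellings for a deprecation period (the 'mu' spelling and the wrapper itself are hypothetical):

import warnings
import numpy as np

def lognormal(mu=None, sigma=1.0, size=None, **legacy):
    # Illustrative shim only: route the deprecated keyword to its new name.
    if 'mean' in legacy:
        warnings.warn("keyword 'mean' is deprecated; use 'mu'",
                      DeprecationWarning, stacklevel=2)
        mu = legacy.pop('mean')
    if legacy:
        raise TypeError("unexpected keyword(s): %s" % ", ".join(sorted(legacy)))
    if mu is None:
        mu = 0.0
    return np.random.lognormal(mean=mu, sigma=sigma, size=size)

Old callers keep working for a few releases and get pointed at the new spelling; the cost is exactly the kind of churn this thread is worried about.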
Sturla From sebastian at sipsolutions.net Tue Mar 10 14:32:39 2015 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Tue, 10 Mar 2015 19:32:39 +0100 Subject: [Numpy-discussion] Would like to patch docstring for numpy.random.normal In-Reply-To: References: <1557076892447072653.462812sturla.molden-gmail.com@news.gmane.org> Message-ID: <1426012359.26853.0.camel@sipsolutions.net> On Di, 2015-03-10 at 11:22 -0700, Nathaniel Smith wrote: > On Mar 10, 2015 11:15 AM, "Paul Hobson" wrote: > > > > > > On Mon, Mar 9, 2015 at 4:33 PM, Charles R Harris > wrote: > >> > >> > >> > >> On Mon, Mar 9, 2015 at 2:34 PM, Paul Hobson > wrote: > >>> > >>> I feel your pain. Making it worse, numpy.random.lognormal takes > "mean" and "sigma" as input. If there's ever a backwards incompatible > release, I hope these things will be cleared up. > >> > >> > >> There is a numpy 2.0 milestone ;) > >> > > > > Is it worth submitting PRs against the existing 2.X branch or is > that so far away that the can should be kicked down the road? > > Not sure what you mean by "the existing 2.X branch" (does such a thing > exist somewhere?), but yeah, don't submit PRs like that. Best case > they'd bit rot before we ever get around to 2.0, worst case 2.0 may > never happen. (What, you liked python 2 -> 3 so much you want to go > through that again?) > We could try to maintain a list of things like this for others like blaze to not fall into the same pits. But then I this is likely not quite of that caliber ;). > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From pmhobson at gmail.com Tue Mar 10 18:20:26 2015 From: pmhobson at gmail.com (Paul Hobson) Date: Tue, 10 Mar 2015 15:20:26 -0700 Subject: [Numpy-discussion] Would like to patch docstring for numpy.random.normal In-Reply-To: References: <1557076892447072653.462812sturla.molden-gmail.com@news.gmane.org> Message-ID: On Tue, Mar 10, 2015 at 11:22 AM, Nathaniel Smith wrote: > On Mar 10, 2015 11:15 AM, "Paul Hobson" wrote: > > > > > > On Mon, Mar 9, 2015 at 4:33 PM, Charles R Harris < > charlesr.harris at gmail.com> wrote: > >> > >> > >> > >> On Mon, Mar 9, 2015 at 2:34 PM, Paul Hobson wrote: > >>> > >>> I feel your pain. Making it worse, numpy.random.lognormal takes "mean" > and "sigma" as input. If there's ever a backwards incompatible release, I > hope these things will be cleared up. > >> > >> > >> There is a numpy 2.0 milestone ;) > >> > > > > Is it worth submitting PRs against the existing 2.X branch or is that so > far away that the can should be kicked down the road? > > Not sure what you mean by "the existing 2.X branch" (does such a thing > exist somewhere?), but yeah, don't submit PRs like that. Best case they'd > bit rot before we ever get around to 2.0, worst case 2.0 may never happen. > (What, you liked python 2 -> 3 so much you want to go through that again?) > It's been a while, but last time I build numpy from the master on github, numpy.__version__ came back as 2.dev or something like that. So I always assumed there was a 2.X branch. It's been a while since I've built numpy, but clearly I'm mistaken. -p -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From sturla.molden at gmail.com Wed Mar 11 10:22:43 2015 From: sturla.molden at gmail.com (Sturla Molden) Date: Wed, 11 Mar 2015 15:22:43 +0100 Subject: [Numpy-discussion] Introductory mail and GSoc Project "Vector math library integration" In-Reply-To: References: Message-ID: There are several vector math libraries NumPy could use, e.g. MKL/VML, Apple Accelerate (vecLib), ACML, and probably others. They all suffer from requiring dense arrays and specific array alignments, whereas NumPy arrays have very flexible strides and flexible alignment. NumPy also has ufuncs and gufuncs as a complicating factor. There are at least two ways to proceed here. One is to only use vector math when strides and alignment allow it. The other is to build a vector math library specifically for NumPy arrays and (g)ufuncs. The latter you will most likely not be able to do in a summer. You should also consider Numba and Numexpr. They have some support for vector math libraries too. Sturla On 08/03/15 21:47, Dp Docs wrote: > Hi all, > I am a CS 3rd Undergrad. Student from an Indian Institute (III T). I > believe I am good in Programming languages like C/C++, Python as I > have already done Some Projects using these language as a part of my > academics. I really like Coding (Competitive as well as development). > I really want to get involved in Numpy Development Project and want to > take "Vector math library integration" as a part of my project. I > want to here any idea from your side for this project. > Thanks For your time for reading this email and responding back. > > My IRCnickname: dp > > Real Name: Durgesh Pandey. > From sdpan21 at gmail.com Wed Mar 11 11:51:29 2015 From: sdpan21 at gmail.com (Dp Docs) Date: Wed, 11 Mar 2015 21:21:29 +0530 Subject: [Numpy-discussion] Introductory mail and GSoc Project "Vector math library integration" In-Reply-To: References: Message-ID: On Wed, Mar 11, 2015 at 7:52 PM, Sturla Molden wrote: > ?Hi sturla, Thanks for suggestion.? > There are several vector math libraries NumPy could use, e.g. MKL/VML, > Apple Accelerate (vecLib), ACML, and probably others. ?Are these libraries fast enough in comparison to C maths libraries?? > They all suffer from requiring dense arrays and specific array > alignments, whereas NumPy arrays have very flexible strides and flexible > alignment. ?>? NumPy also has ufuncs and gufuncs as a complicating factor. ?I don't think the project is supposed to modify the existing functionality as whenever the Faster libraries ?will be unavailable, it should use the default libraries. > > There are at least two ways to proceed here. One is to only use vector > math when strides and alignment allow it. ?I didn't got it. can you explain in detail?? ?>? The other is to build a vector > math library specifically for NumPy arrays and (g)ufuncs. The latter you > will most likely not be able to do in a summer. ?I have also came up with this approach but I am confused a bit with this approach.? > > You should also consider Numba > and Numexpr. They have some support for > vector math libraries too. > ?I will look into this. I think the actual problem is not "to choose which library to integrate", it is how to integrate these libraries? as I have seen the code base and been told the current implementation uses the c math library, Can we just use the current implementation and whenever it is calling ?C Maths functions, we will replace by these above fast library functions? 
Then we have to modify the Numpy library (which usually get imported for maths operation) by using some if else conditions like first work with the faster one and if it is not available the look for the Default one. Moreover, I have Another Doubt also. are we suppose to integrate just one fast library or more than one so that if one is not available, look for the second one and if second is not available then either go to default are look for the third one if available? Are we suppose to think like this: Let say "exp" is faster in sleef library so integrate sleef library for this operation and let say "sin" is faster in any other library, so integrate that library for sin operation? I mean, it may be possible that different operations are faster in different libraries So the implementation should be operation oriented or just integrate one complete library? ?Thanks? -- Durgesh Pandey, IIIT-Hyderabad,India. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gregor.thalhammer at gmail.com Wed Mar 11 13:04:27 2015 From: gregor.thalhammer at gmail.com (Gregor Thalhammer) Date: Wed, 11 Mar 2015 18:04:27 +0100 Subject: [Numpy-discussion] Introductory mail and GSoc Project "Vector math library integration" In-Reply-To: References: Message-ID: > Am 08.03.2015 um 21:47 schrieb Dp Docs : > > Hi all, > I am a CS 3rd Undergrad. Student from an Indian Institute (III T). I > believe I am good in Programming languages like C/C++, Python as I > have already done Some Projects using these language as a part of my > academics. I really like Coding (Competitive as well as development). > I really want to get involved in Numpy Development Project and want to > take "Vector math library integration" as a part of my project. I > want to here any idea from your side for this project. > Thanks For your time for reading this email and responding back. > On the scipy mailing list I also answered to Amine, who is also interested in this proposal. Long time ago I wrote a package that provides fast math functions (ufuncs) for numpy, using Intel?s MKL/VML library, see https://github.com/geggo/uvml and my comments there. This code could be easily ported to use other vector math libraries. Would be interesting to evaluate other possibilities. Due to the fact that MKL is non-free, there are concerns to use it with numpy, although e.g. numpy and scipy using the MKL LAPACK routines are used frequently (Anaconda or Christoph Gohlkes binaries). You can easily inject the fast math ufuncs into numpy, e.g. with set_numeric_ops() or np.sin = vml.sin. Gregor -------------- next part -------------- An HTML attachment was scrubbed... URL: From faltet at gmail.com Wed Mar 11 13:54:34 2015 From: faltet at gmail.com (Francesc Alted) Date: Wed, 11 Mar 2015 18:54:34 +0100 Subject: [Numpy-discussion] Introductory mail and GSoc Project "Vector math library integration" In-Reply-To: References: Message-ID: 2015-03-08 21:47 GMT+01:00 Dp Docs : > Hi all, > I am a CS 3rd Undergrad. Student from an Indian Institute (III T). I > believe I am good in Programming languages like C/C++, Python as I > have already done Some Projects using these language as a part of my > academics. I really like Coding (Competitive as well as development). > I really want to get involved in Numpy Development Project and want to > take "Vector math library integration" as a part of my project. I > want to here any idea from your side for this project. > Thanks For your time for reading this email and responding back. 
> As Sturla and Gregor suggested, there are quite a few attempts to solve this shortcoming in NumPy. In particular Gregor integrated MKL/VML support in numexpr quite a long time ago, and when combined with my own implementation of pooled threads (behaving better than Intel's implementation in VML), then the thing literally flies: https://github.com/pydata/numexpr/wiki/NumexprMKL numba is also another interesting option and it shows much better compiling times than the integrated compiler in numexpr. You can see a quick comparison about expected performances between numexpr and numba: http://nbviewer.ipython.org/gist/anonymous/4117896 In general, numba wins for small arrays, but numexpr can achieve very good performance for larger ones. I think there are interesting things to discover in both projects, as for example, how they manage memory in order to avoid temporaries or how they deal with unaligned data efficiently. I would advise to look at existing docs and presentations explaining things in more detail too. All in all, I would really love to see such a vector math library support integrated in NumPy because frankly, I don't have bandwidth for maintaining numexpr anymore (and I am afraid that nobody else would jump in this ship ;). Good luck! Francesc > > My IRCnickname: dp > > Real Name: Durgesh Pandey. > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Francesc Alted -------------- next part -------------- An HTML attachment was scrubbed... URL: From sturla.molden at gmail.com Wed Mar 11 14:47:55 2015 From: sturla.molden at gmail.com (Sturla Molden) Date: Wed, 11 Mar 2015 19:47:55 +0100 Subject: [Numpy-discussion] SIMD programming in Cython Message-ID: So I just learned a new trick. This is a very nice one which can be nice to know about, so I thought I should share this: https://groups.google.com/forum/#!msg/cython-users/nTnyI7A6sMc/a6_GnOOsLuQJ Regards, Sturla From davidmenhur at gmail.com Wed Mar 11 16:31:34 2015 From: davidmenhur at gmail.com (=?UTF-8?B?RGHPgGlk?=) Date: Wed, 11 Mar 2015 21:31:34 +0100 Subject: [Numpy-discussion] Introductory mail and GSoc Project "Vector math library integration" In-Reply-To: References: Message-ID: On 11 March 2015 at 16:51, Dp Docs wrote: > > > On Wed, Mar 11, 2015 at 7:52 PM, Sturla Molden > wrote: > > > ?Hi sturla, > Thanks for suggestion.? > > > There are several vector math libraries NumPy could use, e.g. MKL/VML, > > Apple Accelerate (vecLib), ACML, and probably others. > > ?Are these libraries fast enough in comparison to C maths libraries?? > These are the fastest beast out there, written in C, Assembly, and arcane incantations. > > There are at least two ways to proceed here. One is to only use vector > > math when strides and alignment allow it. > ?I didn't got it. can you explain in detail?? > One example, you can create a numpy 2D array using only the odd columns of a matrix. odd_matrix = full_matrix[::2, ::2] This is just a view of the original data, so you save the time and the memory of making a copy. The drawback is that you trash memory locality, as the elements are not contiguous in memory. If the memory is guaranteed to be contiguous, a compiler can apply extra optimisations, and this is what vector libraries usually assume. 
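To make the example concrete (plain numpy, illustrative only; the numbers assume float64):

    >>> import numpy as np
    >>> full_matrix = np.arange(16.0).reshape(4, 4)
    >>> odd_matrix = full_matrix[::2, ::2]          # a view, no copy made
    >>> odd_matrix.flags['C_CONTIGUOUS']
    False
    >>> odd_matrix.strides                          # bytes to step per axis
    (64, 16)
    >>> np.ascontiguousarray(odd_matrix).strides    # a dense copy a vector library could consume
    (16, 8)

Whether making that copy pays off depends on how expensive the math function is relative to the extra memory traffic.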
What I think Sturla is suggesting with "when strides and alignment allow it" is to use the fast version if the array is contiguous, and fall back to the present implementation otherwise. Another would be to make an optimally aligned copy, but that could eat up whatever time we save from using the faster library, and cause other problems.

The difficulty with Numpy's strides is that they allow so many ways of manipulating the data... (alternating elements, transpositions, different precisions...).

> I think the actual problem is not "to choose which library to integrate",
> it is how to integrate these libraries. As I have seen the code base and
> been told, the current implementation uses the C math library. Can we just
> keep the current implementation and, whenever it calls a C math
> function, replace that call with one of these faster library functions?
> Then we have to modify the Numpy library (which usually gets imported for math
> operations) with some if/else conditions: first try the faster one,
> and if it is not available then fall back to the default one.

At the moment, we are linking to whichever LAPACK is available at compile time, so no need for a runtime check. I guess it could (should?) be the same.

> Moreover, I have another doubt: are we supposed to integrate just one
> fast library or more than one, so that if the first is not available we look for the
> second, and if the second is not available we either fall back to the default or
> look for a third?
> Are we supposed to think like this: say "exp" is faster in the SLEEF
> library, so we integrate SLEEF for that operation, and say "sin" is
> faster in some other library, so we integrate that library for sin? I
> mean, it may be possible that different operations are faster in different
> libraries, so should the implementation be operation-oriented, or should we just
> integrate one complete library? Thanks

Which one is faster depends on the hardware, the version of the library, and even the size of the problem:
http://s3.postimg.org/wz0eis1o3/single.png

I don't think you can reliably decide ahead of time which one should go for each operation. But, on the other hand, whichever one you go for will probably be fast enough for anyone using Python. Most of the work here is adapting Numpy's machinery to dispatch a call to the vector library; once that is ready, adding another one will hopefully be easier. At least, at the moment Numpy can use one of several linear algebra packages (MKL, ATLAS, CBLAS...) and they are added, I think, without too much pain (but maybe I am just far away from the screams of whoever did it).

/David.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From sdpan21 at gmail.com  Wed Mar 11 18:18:28 2015
From: sdpan21 at gmail.com (Dp Docs)
Date: Thu, 12 Mar 2015 03:48:28 +0530
Subject: [Numpy-discussion] Introductory mail and GSoc Project "Vector math library integration"
In-Reply-To: 
References: 
Message-ID: 

On Wed, Mar 11, 2015 at 10:34 PM, Gregor Thalhammer < gregor.thalhammer at gmail.com> wrote:
>
> On the scipy mailing list I also answered to Amine, who is also interested in this proposal.

Can you provide the link to that discussion? I am having trouble finding it.

> Long time ago I wrote a package that
> provides fast math functions (ufuncs) for numpy, using Intel's MKL/VML library, see https://github.com/geggo/uvml and my comments
> there. This code could be easily ported to use other vector math libraries.
When MKL is not available for a system, will this integration work with the default numpy math functions?

> Would be interesting to evaluate other possibilities. Due to
> the fact that MKL is non-free, there are concerns to use it with numpy,
> although e.g. numpy and scipy using the MKL LAPACK
> routines are used frequently (Anaconda or Christoph Gohlke's binaries).
>
> You can easily inject the fast math ufuncs into numpy, e.g. with set_numeric_ops() or np.sin = vml.sin.

Can you explain that in a bit more detail, or provide a link where I can see it?

Thanks for your valuable suggestion.
--
Durgesh Pandey, IIIT-Hyderabad, India.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From sdpan21 at gmail.com  Wed Mar 11 18:20:03 2015
From: sdpan21 at gmail.com (Dp Docs)
Date: Thu, 12 Mar 2015 03:50:03 +0530
Subject: [Numpy-discussion] Introductory mail and GSoc Project "Vector math library integration"
In-Reply-To: 
References: 
Message-ID: 

On Thu, Mar 12, 2015 at 2:01 AM, Daπid wrote:
>
> On 11 March 2015 at 16:51, Dp Docs wrote:
>> On Wed, Mar 11, 2015 at 7:52 PM, Sturla Molden wrote:
>> >
>> > There are at least two ways to proceed here. One is to only use vector
>> > math when strides and alignment allow it.
>> I didn't get it. Can you explain in detail?
>
> One example, you can create a numpy 2D array using only the odd columns of a matrix.
>
> odd_matrix = full_matrix[::2, ::2]
>
> This is just a view of the original data, so you save the time and the memory of making a copy. The drawback is that you trash
> memory locality, as the elements are not contiguous in memory. If the memory is guaranteed to be contiguous, a compiler can apply
> extra optimisations, and this is what vector libraries usually assume. What I think Sturla is suggesting with "when strides and alignment
> allow it" is to use the fast version if the array is contiguous, and fall back to the present implementation otherwise. Another would be to
> make an optimally aligned copy, but that could eat up whatever time we save from using the faster library, and cause other problems.
>
> The difficulty with Numpy's strides is that they allow so many ways of manipulating the data... (alternating elements, transpositions, different precisions...).
>
>> I think the actual problem is not "to choose which library to integrate",
>> it is how to integrate these libraries. As I have seen the code base and
>> been told, the current implementation uses the C math library. Can we just
>> keep the current implementation and, whenever it calls a C math
>> function, replace that call with one of these faster library functions?
>> Then we have to modify the Numpy library (which usually gets imported for math
>> operations) with some if/else conditions: first try the faster one,
>> and if it is not available then fall back to the default one.
>
> At the moment, we are linking to whichever LAPACK is available at compile time, so no need for a runtime check. I guess it could
> (should?) be the same.

I didn't understand this. I was asking: say I have chosen one faster library; now I need to integrate it in *some way* without changing the default functionality, so that when Numpy is imported with "from numpy import *" it can reach the integrated library's functions as well as the default ones. What should that *some way* be? Even at compile time, something has to decide which function is going to be used, right?
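To make the question concrete, here is a pure-Python sketch of the fallback pattern (fastmath is a stand-in name for whichever optimized binding gets chosen, not a real package; for the ufuncs themselves the real selection would happen at build time in numpy's C layer, as with BLAS/LAPACK):

    import numpy as np

    try:
        import fastmath              # hypothetical optimized vector-math binding
        np.sin = fastmath.sin        # runtime injection, as Gregor describes
    except ImportError:
        pass                         # numpy's default C-library ufuncs stay in place

After this, "from numpy import *" hands out whichever implementation won, with no change to the public API.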
It has been discussed above that the MKL libraries could be integrated, but when MKL is not available for a hardware architecture, will the integration fall back to the default library? If yes, then the integration method discussed above may be the one required for this project, right? Can you please tell me a bit more, or provide some link related to that?

Does the availability of these faster libraries depend on the hardware architecture, or on the availability of hardware resources in a system at a given moment? If it is the latter, the newly integrated library would support operations only some of the time. I believe it is the former, but it is better to clear up any confusion. For example, suppose availability meant free hardware resources: if library A needed resource A1 and A1 were busy, A could not serve the operation, while library B, needing resource B1 that happened to be idle, could. Under that reading, availability at the moment of an operation would be random and unpredictable; even worse, if a little time passed between compile time and run time and the resource had been allocated to another process in the meantime, using these operations would be very problematic. So "availability" must mean whether the hardware architecture supports the library at all. And since there are many kinds of hardware architecture, and it is not a given that one library supports them all (though one may), we would need to integrate more than one library to cover every architecture, which in the ideal case makes this a very big project.

>
>> Moreover, I have another doubt: are we supposed to integrate just one
>> fast library or more than one, so that if the first is not available we look for the
>> second, and if the second is not available we either fall back to the default or
>> look for a third?
>> Are we supposed to think like this: say "exp" is faster in the SLEEF
>> library, so we integrate SLEEF for that operation, and say "sin" is
>> faster in some other library, so we integrate that library for sin? I
>> mean, it may be possible that different operations are faster in different
>> libraries, so should the implementation be operation-oriented, or should we just
>> integrate one complete library? Thanks
>
> Which one is faster depends on the hardware, the version of the library, and even the size of the problem:
> http://s3.postimg.org/wz0eis1o3/single.png
>
> I don't think you can reliably decide ahead of time which one should go for each operation. But, on the other hand, whichever one you
> go for will probably be fast enough for anyone using Python. Most of the work here is adapting Numpy's machinery to dispatch a call to
> the vector library; once that is ready, adding another one will hopefully be easier. At least, at the moment Numpy can use one of
> several linear algebra packages (MKL, ATLAS, CBLAS...) and they are added, I think, without too much pain (but maybe I am just far
> away from the screams of whoever did it).
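A rough pure-Python sketch of what such per-operation dispatch could look like, purely for illustration — the module names in the comments are hypothetical, and a real implementation would live inside numpy's ufunc machinery:

    import numpy as np

    def pick_impl(name, candidate_libs):
        # first candidate that provides `name` wins; numpy's own ufunc is the fallback
        for lib in candidate_libs:
            impl = getattr(lib, name, None)
            if impl is not None:
                return impl
        return getattr(np, name)

    # exp = pick_impl("exp", [sleef_bindings])    # hypothetical binding modules
    # sin = pick_impl("sin", [other_bindings])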
>
So are we supposed to integrate just one of these libraries (with the rest of the operations using the default if it doesn't support them)? MKL seems good, but as discussed above it is non-free, and it has already been integrated; can you suggest any other library which at least approximates MKL well? Eigen seems good, but it appears to be worse in the middle ranges. Can you provide any link with comparative information about all the available (free) vector libraries?

Thanks and regards,
--
Durgesh Pandey, IIIT-Hyderabad, India.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From sturla.molden at gmail.com  Wed Mar 11 20:19:55 2015
From: sturla.molden at gmail.com (Sturla Molden)
Date: Thu, 12 Mar 2015 01:19:55 +0100
Subject: [Numpy-discussion] Introductory mail and GSoc Project "Vector math library integration"
In-Reply-To: 
References: 
Message-ID: 

On 11/03/15 23:20, Dp Docs wrote:
> So are we supposed to integrate just one of these libraries?

As a Mac user I would be annoyed if we only supported MKL and not Accelerate Framework. AMD LibM should be supported too. MKL is non-free, but we use it for BLAS and LAPACK. AMD LibM is non-free in a similar manner. Accelerate Framework (vecLib) is a part of Apple's operating systems.

You can abstract out the differences.

Eigen is C++. We do not use C++ in NumPy, only C, Python and some Cython. C++ and Fortran can be used in SciPy, but not in NumPy.

Apple's reference is here:
https://developer.apple.com/library/mac/documentation/Performance/Conceptual/vecLib/index.html

AMD's information is here:
http://developer.amd.com/tools-and-sdks/cpu-development/libm/
(Vector math functions seem to be moved from ACML to LibM)

Intel's reference is here:
https://software.intel.com/sites/products/documentation/doclib/iss/2013/mkl/mklman/GUID-59EC4B87-29C8-4FB4-B57C-D269E6364954.htm

Sturla

From ralf.gommers at gmail.com  Thu Mar 12 03:11:23 2015
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Thu, 12 Mar 2015 08:11:23 +0100
Subject: [Numpy-discussion] Introductory mail and GSoc Project "Vector math library integration"
In-Reply-To: 
References: 
Message-ID: 

On Wed, Mar 11, 2015 at 11:20 PM, Dp Docs wrote:
>
> On Thu, Mar 12, 2015 at 2:01 AM, Daπid wrote:
> >
> > On 11 March 2015 at 16:51, Dp Docs wrote:
> >> On Wed, Mar 11, 2015 at 7:52 PM, Sturla Molden wrote:
> >> >
> >> > There are at least two ways to proceed here. One is to only use vector
> >> > math when strides and alignment allow it.
> >> I didn't get it. Can you explain in detail?
> >
> > One example, you can create a numpy 2D array using only the odd columns of a matrix.
> >
> > odd_matrix = full_matrix[::2, ::2]
> >
> > This is just a view of the original data, so you save the time and the memory of making a copy. The drawback is that you trash
> > memory locality, as the elements are not contiguous in memory. If the memory is guaranteed to be contiguous, a compiler can apply
> > extra optimisations, and this is what vector libraries usually assume. What I think Sturla is suggesting with "when strides and alignment
> > allow it" is to use the fast version if the array is contiguous, and fall back to the present implementation otherwise. Another would be to
> > make an optimally aligned copy, but that could eat up whatever time we save from using the faster library, and cause other problems.
> >
> > The difficulty with Numpy's strides is that they allow so many ways of manipulating the data...
(alternating elements, transpositions, different > precisions...). > > > >> > >> I think the actual problem is not "to choose which library to > integrate", it is how to integrate these libraries? as I have seen the code > ?>>? > base and been told the current implementation uses the c math library, Can > we just use the current implementation and whenever it > ?>>? > is calling C Maths functions, we will replace by these above fast library > functions?Then we have to modify the Numpy library (which > ?>>? > usually get imported for maths operation) by using some if else conditions > like first work with the faster one and if it is not available > ?>>? > the look for the Default one. > > > > > > At the moment, we are linking to whichever LAPACK is avaliable at > compile time, so no need for a runtime check. I guess it could > ?>? > (should?) be the same. > ?I didn't understand this. I was asking about let say I have chosen one > faster library, now I need to integrate this? in *some way *without > changing the default functionality so that when Numpy will import "from > numpy import *",it should be able to access the integrated libraries > functions as well as default libraries functions, What should we be that* some > way*?? Even at the Compile, it need to decide that which Function it is > going to use, right? > Indeed, it should probably work similar to how BLAS/LAPACK functions are treated now. So you can support multiple libraries in numpy (pick only one to start with of course), but at compile time you'd pick the one to use. Then that library gets always called under the hood, i.e. no new public functions/objects in numpy but only improved performance of existing ones. It have been discussed above about integration of MKL libraries but when > MKL is not available on the hardware Architecture, will the above library > support as default library? if yes, then the Above discussed integration > method may be the required one for integration in this project, right? > Can you please tell me a bit more or provide some link related to that?? > Availability of these faster Libraries depends on the Hardware > Architectures etc. or availability of hardware Resources in a System? > because if it is later one, this newly integrated library will support > operations some time while sometimes not? > Not HW resources I'd think. Looking at http://www.yeppp.info, it supports all commonly used cpus/instruction sets. As long as the accuracy of the library is OK this should not be noticeable to users except for the difference in performance. > I believe it's the first one but it is better to clear any type of > confusion. For example, assuming availability of Hardware means later one, > let say if library A needed the A1 for it's support and A1 is busy then it > will not be able to support the operation. Meanwhile, library B, needs > Support of hardware type B1 , and it's not Busy then it will support these > operations. What I want to say is Assuming the Availability of faster lib. > means availability of hardware Resources in a System at a particular time > when we want to do operation, it's totally unpredictable and Availability > of these resources will be Random and even worse, if it take a bit extra > time between compile and running, and that h/d resource have been allocated > to other process in the meantime then it would be very problematic to use > these operations. So this leads to think that Availability of lib. means > type of h/d architecture whether it supports or not that lib. 
Since there > are many kind of h/d architecture and it is not the case that one library > support all these architectures (though it may be), So we need to integrate > more than one lib. for providing support to all kind of architecture (in > ideal case which will make it to be a very big project). > > > >> > >> Moreover, I have Another Doubt also. are we suppose to integrate just > one fast library or more than one so that if one is not available, look for > the second one and if second is not available then either go to default are > look for the third one if available? > >> Are we suppose to think like this: Let say "exp" is faster in sleef > library so integrate sleef library for this operation and let say "sin" is > faster in any other library, so integrate that library for sin operation? I > mean, it may be possible that different operations are faster in different > libraries So the implementation should be operation oriented or just > integrate one complete library?Thanks > > > > > > Which one is faster depends on the hardware, the version of the library, > and even the size of the problem: > > http://s3.postimg.org/wz0eis1o3/single.png > > > > I don't think you can reliably decide ahead of time which one should go > for each operation. But, on the other hand, whichever one you > ?>? > go for will probably be fast enough for anyone using Python. Most of the > work here is adapting Numpy's machinery to dispatch a call to > ?>? > the vector library, once that is ready, adding another one will hopefully > be easier. At least, at the moment Numpy can use one of > ?>? > several linear algebra packages (MKL, ATLAS, CBLAS...) and they are added, > I think, without too much pain (but maybe I am just far > ?>? > away from the screams of whoever did it). > > > ???So we are supposed to integrate just one of these libraries?(rest will > use default if they didn't support) ?MKL seems to be good but as it have > been discussed above that it's non-free and it have been integrated also, > can you suggest any other library which at least approximate MKL in a > better way? Though Eigen seems to be good, but it's seems to be worse in > middle ranges. can you provide any link which provide comparative > information about all available vector libraries(Free)??? > The idea on the GSoC page suggests http://www.yeppp.info/ or SLEEF ( http://shibatch.sourceforge.net/). Based on those websites I'm 99.9% sure that yeppp is a better bet. At least its benchmarks say that it's faster than MKL. As for the project, Julian (who'd likely be the main mentor) has already indicated when suggesting the idea that he has no interest in a non-free library: http://comments.gmane.org/gmane.comp.python.numeric.general/56933. So Yeppp + the build architecture to support multiple libraries later on would probably be a good target. Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From sdpan21 at gmail.com Thu Mar 12 05:03:17 2015 From: sdpan21 at gmail.com (Dp Docs) Date: Thu, 12 Mar 2015 14:33:17 +0530 Subject: [Numpy-discussion] Introductory mail and GSoc Project "Vector math library integration" In-Reply-To: References: Message-ID: Thanks to all of you for such a nice Discussion and Suggestion. I think most of my doubts have been resolved. If there will be something more i will let you people Know. Thanks again. -- Durgesh pandey IIIT Hyderabad,India -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From gregor.thalhammer at gmail.com Thu Mar 12 05:15:10 2015 From: gregor.thalhammer at gmail.com (Gregor Thalhammer) Date: Thu, 12 Mar 2015 10:15:10 +0100 Subject: [Numpy-discussion] Introductory mail and GSoc Project "Vector math library integration" In-Reply-To: References: Message-ID: <0C44B0F4-D0F6-474A-AD0D-E2A486165691@gmail.com> > Am 11.03.2015 um 23:18 schrieb Dp Docs : > > > > On Wed, Mar 11, 2015 at 10:34 PM, Gregor Thalhammer > wrote: > > > > > > On the scipy mailing list I also answered to Amine, who is also interested in this proposal. > ?? Can you provide the link of that discussion? I am getting trouble in searching that. > > ?>?Long time ago I wrote a package that ? > >?provides fast math functions (ufuncs) for numpy, using Intel?s MKL/VML library, see https://github.com/geggo/uvml and my comments ?>?there. This code could be easily ported to use other vector math libraries. > > ?When MKL is not available for a System, will this integration work with default numpy maths functions? > ? > ?>? Would be interesting to evaluate other possibilities. Due to ?>?the fact that MKL is non-free, there are concerns to use it with numpy, ?>?although e.g. numpy and scipy using the MKL LAPACK ?>?routines are used frequently (Anaconda or Christoph Gohlkes binaries). > > > > You can easily inject the fast math ufuncs into numpy, e.g. with set_numeric_ops() or np.sin = vml.sin. > > ?Can you explain in a bit detail or provide a link where i can see it?? My approach for https://github.com/geggo/uvml was to provide a separate python extension that provides faster numpy ufuncs for math operations like exp, sin, cos, ? To replace the standard numpy ufuncs by the optimized ones you don?t need to apply changes to the source code of numpy, instead at runtime you monkey patch it and get faster math everywhere. Numpy even offers an interface (set_numeric_ops) to modify it at runtime. Another note, numpy makes it easy to provide new ufuncs, see http://docs.scipy.org/doc/numpy-dev/user/c-info.ufunc-tutorial.html from a C function that operates on 1D arrays, but this function needs to support arbitrary spacing (stride) between the items. Unfortunately, to achieve good performance, vector math libraries often expect that the items are laid out contiguously in memory. MKL/VML is a notable exception. So for non contiguous in- or output arrays you might need to copy the data to a buffer, which likely kills large amounts of the performance gain. This does not completely rule out some of the libraries, since performance critical data is likely to be stored in contiguous arrays. Using a library that supports only vector math for contiguous arrays is more difficult, but perhaps the numpy nditer provides everything needed. Gregor -------------- next part -------------- An HTML attachment was scrubbed... URL: From ndbecker2 at gmail.com Thu Mar 12 08:14:17 2015 From: ndbecker2 at gmail.com (Neal Becker) Date: Thu, 12 Mar 2015 08:14:17 -0400 Subject: [Numpy-discussion] Introductory mail and GSoc Project "Vector math library integration" References: Message-ID: Ralf Gommers wrote: > On Wed, Mar 11, 2015 at 11:20 PM, Dp Docs wrote: > >> >> >> On Thu, Mar 12, 2015 at 2:01 AM, Da?id wrote: >> > >> > On 11 March 2015 at 16:51, Dp Docs wrote: >> >> On Wed, Mar 11, 2015 at 7:52 PM, Sturla Molden >> >> >> wrote: >> >> > >> >> > There are at least two ways to proceed here. One is to only use >> >> > vector math when strides and alignment allow it. >> >> I didn't got it. can you explain in detail? 
>> > >> > >> > One example, you can create a numpy 2D array using only the odd columns >> of a matrix. >> > >> > odd_matrix = full_matrix[::2, ::2] >> > >> > This is just a view of the original data, so you save the time and the >> memory of making a copy. The drawback is that you trash >> ?>? >> memory locality, as the elements are not contiguous in memory. If the >> memory is guaranteed to be contiguous, a compiler can apply >> ?>? >> extra optimisations, and this is what vector libraries usually assume. >> What I think Sturla is suggesting with "when strides and aligment >> ?>? >> allow it" is to use the fast version if the array is contiguous, and fall >> back to the present implementation otherwise. Another would be to >> ?>? >> make an optimally aligned copy, but that could eat up whatever time we >> save from using the faster library, and other problems. >> > >> > The difficulty with Numpy's strides is that they allow so many ways of >> manipulating the data... (alternating elements, transpositions, different >> precisions...). >> > >> >> >> >> I think the actual problem is not "to choose which library to >> integrate", it is how to integrate these libraries? as I have seen the >> code ?>>? >> base and been told the current implementation uses the c math library, >> Can >> we just use the current implementation and whenever it >> ?>>? >> is calling C Maths functions, we will replace by these above fast library >> functions?Then we have to modify the Numpy library (which >> ?>>? >> usually get imported for maths operation) by using some if else >> conditions >> like first work with the faster one and if it is not available >> ?>>? >> the look for the Default one. >> > >> > >> > At the moment, we are linking to whichever LAPACK is avaliable at >> compile time, so no need for a runtime check. I guess it could >> ?>? >> (should?) be the same. >> ?I didn't understand this. I was asking about let say I have chosen one >> faster library, now I need to integrate this? in *some way *without >> changing the default functionality so that when Numpy will import "from >> numpy import *",it should be able to access the integrated libraries >> functions as well as default libraries functions, What should we be that* >> some way*?? Even at the Compile, it need to decide that which Function it >> is going to use, right? >> > > Indeed, it should probably work similar to how BLAS/LAPACK functions are > treated now. So you can support multiple libraries in numpy (pick only one > to start with of course), but at compile time you'd pick the one to use. > Then that library gets always called under the hood, i.e. no new public > functions/objects in numpy but only improved performance of existing ones. > > It have been discussed above about integration of MKL libraries but when >> MKL is not available on the hardware Architecture, will the above library >> support as default library? if yes, then the Above discussed integration >> method may be the required one for integration in this project, right? >> Can you please tell me a bit more or provide some link related to that?? >> Availability of these faster Libraries depends on the Hardware >> Architectures etc. or availability of hardware Resources in a System? >> because if it is later one, this newly integrated library will support >> operations some time while sometimes not? >> > > Not HW resources I'd think. Looking at http://www.yeppp.info, it supports > all commonly used cpus/instruction sets. 
> As long as the accuracy of the library is OK this should not be noticeable > to users except for the difference in performance. > > >> I believe it's the first one but it is better to clear any type of >> confusion. For example, assuming availability of Hardware means later >> one, >> let say if library A needed the A1 for it's support and A1 is busy then >> it >> will not be able to support the operation. Meanwhile, library B, needs >> Support of hardware type B1 , and it's not Busy then it will support >> these operations. What I want to say is Assuming the Availability of >> faster lib. means availability of hardware Resources in a System at a >> particular time when we want to do operation, it's totally unpredictable >> and Availability of these resources will be Random and even worse, if it >> take a bit extra time between compile and running, and that h/d resource >> have been allocated to other process in the meantime then it would be >> very problematic to use these operations. So this leads to think that >> Availability of lib. means type of h/d architecture whether it supports >> or not that lib. Since there are many kind of h/d architecture and it is >> not the case that one library support all these architectures (though it >> may be), So we need to integrate more than one lib. for providing support >> to all kind of architecture (in ideal case which will make it to be a >> very big project). >> > >> >> >> >> Moreover, I have Another Doubt also. are we suppose to integrate just >> one fast library or more than one so that if one is not available, look >> for the second one and if second is not available then either go to >> default are look for the third one if available? >> >> Are we suppose to think like this: Let say "exp" is faster in sleef >> library so integrate sleef library for this operation and let say "sin" >> is faster in any other library, so integrate that library for sin >> operation? I mean, it may be possible that different operations are >> faster in different libraries So the implementation should be operation >> oriented or just integrate one complete library?Thanks >> > >> > >> > Which one is faster depends on the hardware, the version of the >> > library, >> and even the size of the problem: >> > http://s3.postimg.org/wz0eis1o3/single.png >> > >> > I don't think you can reliably decide ahead of time which one should go >> for each operation. But, on the other hand, whichever one you >> ?>? >> go for will probably be fast enough for anyone using Python. Most of the >> work here is adapting Numpy's machinery to dispatch a call to >> ?>? >> the vector library, once that is ready, adding another one will hopefully >> be easier. At least, at the moment Numpy can use one of >> ?>? >> several linear algebra packages (MKL, ATLAS, CBLAS...) and they are >> added, I think, without too much pain (but maybe I am just far >> ?>? >> away from the screams of whoever did it). >> > >> ???So we are supposed to integrate just one of these libraries?(rest will >> use default if they didn't support) ?MKL seems to be good but as it have >> been discussed above that it's non-free and it have been integrated also, >> can you suggest any other library which at least approximate MKL in a >> better way? Though Eigen seems to be good, but it's seems to be worse in >> middle ranges. can you provide any link which provide comparative >> information about all available vector libraries(Free)??? 
>> > > The idea on the GSoC page suggests http://www.yeppp.info/ or SLEEF ( > http://shibatch.sourceforge.net/). Based on those websites I'm 99.9% sure > that yeppp is a better bet. At least its benchmarks say that it's faster > than MKL. As for the project, Julian (who'd likely be the main mentor) has > already indicated when suggesting the idea that he has no interest in a > non-free library: > http://comments.gmane.org/gmane.comp.python.numeric.general/56933. So > Yeppp + the build architecture to support multiple libraries later on > would probably be a good target. > > Cheers, > Ralf Thanks for the tip about yeppp. While it looks interesting, it seems to be pretty limited. Just a few transcendental functions. I didn't notice complex either (e.g., dot product). -- Those who fail to understand recursion are doomed to repeat it From jtaylor.debian at googlemail.com Thu Mar 12 08:48:06 2015 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Thu, 12 Mar 2015 13:48:06 +0100 Subject: [Numpy-discussion] Introductory mail and GSoc Project "Vector math library integration" In-Reply-To: <0C44B0F4-D0F6-474A-AD0D-E2A486165691@gmail.com> References: <0C44B0F4-D0F6-474A-AD0D-E2A486165691@gmail.com> Message-ID: <55018B06.7050804@googlemail.com> On 03/12/2015 10:15 AM, Gregor Thalhammer wrote: > > Another note, numpy makes it easy to provide new ufuncs, see > http://docs.scipy.org/doc/numpy-dev/user/c-info.ufunc-tutorial.html > from a C function that operates on 1D arrays, but this function needs to > support arbitrary spacing (stride) between the items. Unfortunately, to > achieve good performance, vector math libraries often expect that the > items are laid out contiguously in memory. MKL/VML is a notable > exception. So for non contiguous in- or output arrays you might need to > copy the data to a buffer, which likely kills large amounts of the > performance gain. The elementary functions are very slow even compared to memory access, they take in the orders of hundreds to tens of thousand cycles to complete (depending on range and required accuracy). Even in the case of strided access that gives the hardware prefetchers plenty of time to load the data before the previous computation is done. This also removes the requirement from the library to provide a strided api, we can copy the strided data into a contiguous buffer and pass it to the library without losing much performance. It may not be optimal (e.g. a library can fine tune the prefetching better for the case where the hardware is not ideal) but most likely sufficient. Figuring out how to best do it to get the best performance and still being flexible in what implementation is used is part of the challenge the student will face for this project. From johannes.kulick at ipvs.uni-stuttgart.de Thu Mar 12 09:31:51 2015 From: johannes.kulick at ipvs.uni-stuttgart.de (Johannes Kulick) Date: Thu, 12 Mar 2015 14:31:51 +0100 Subject: [Numpy-discussion] tie breaking for max, min, argmax, argmin Message-ID: <20150312133151.14361.35612@quirm.robotics.tu-berlin.de> Hello, I wonder if it would be worth to enhance max, min, argmax and argmin (more?) with a tie breaking parameter: If multiple entries have the same value the first value is returned by now. It would be useful to have a parameter to alter this behavior to an arbitrary tie-breaking. I would propose, that the tie-breaking function gets a list with all indices of the max/mins. 
Example: >>> a = np.array([ 1, 2, 5, 5, 2, 1]) >>> np.argmax(a, tie_breaking=random.choice) 3 >>> np.argmax(a, tie_breaking=random.choice) 2 >>> np.argmax(a, tie_breaking=random.choice) 2 >>> np.argmax(a, tie_breaking=random.choice) 2 >>> np.argmax(a, tie_breaking=random.choice) 3 Especially for some randomized experiments it is necessary that not always the first maximum is returned, but a random optimum. Thus I end up writing these things over and over again. I understand, that max and min are crucial functions, which shouldn't be slowed down by the proposed changes. Adding new functions instead of altering the existing ones would be a good option. Are there any concerns against me implementing these things and sending a pull request? Should such a function better be included in scipy for example? Best, Johannes From robert.kern at gmail.com Thu Mar 12 09:42:26 2015 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 12 Mar 2015 13:42:26 +0000 Subject: [Numpy-discussion] tie breaking for max, min, argmax, argmin In-Reply-To: <20150312133151.14361.35612@quirm.robotics.tu-berlin.de> References: <20150312133151.14361.35612@quirm.robotics.tu-berlin.de> Message-ID: On Thu, Mar 12, 2015 at 1:31 PM, Johannes Kulick < johannes.kulick at ipvs.uni-stuttgart.de> wrote: > > Hello, > > I wonder if it would be worth to enhance max, min, argmax and argmin (more?) > with a tie breaking parameter: If multiple entries have the same value the first > value is returned by now. It would be useful to have a parameter to alter this > behavior to an arbitrary tie-breaking. I would propose, that the tie-breaking > function gets a list with all indices of the max/mins. > > Example: > >>> a = np.array([ 1, 2, 5, 5, 2, 1]) > >>> np.argmax(a, tie_breaking=random.choice) > 3 > > >>> np.argmax(a, tie_breaking=random.choice) > 2 > > >>> np.argmax(a, tie_breaking=random.choice) > 2 > > >>> np.argmax(a, tie_breaking=random.choice) > 2 > > >>> np.argmax(a, tie_breaking=random.choice) > 3 > > Especially for some randomized experiments it is necessary that not always the > first maximum is returned, but a random optimum. Thus I end up writing these > things over and over again. > > I understand, that max and min are crucial functions, which shouldn't be slowed > down by the proposed changes. Adding new functions instead of altering the > existing ones would be a good option. > > Are there any concerns against me implementing these things and sending a pull > request? Should such a function better be included in scipy for example? On the whole, I think I would prefer new functions for this. I assume you only need variants for argmin() and argmax() and not min() and max(), since all of the tied values for the latter two would be identical, so returning the first one is just as good as any other. -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... URL: From jtaylor.debian at googlemail.com Thu Mar 12 09:49:10 2015 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Thu, 12 Mar 2015 14:49:10 +0100 Subject: [Numpy-discussion] tie breaking for max, min, argmax, argmin In-Reply-To: References: <20150312133151.14361.35612@quirm.robotics.tu-berlin.de> Message-ID: <55019956.9000806@googlemail.com> On 03/12/2015 02:42 PM, Robert Kern wrote: > On Thu, Mar 12, 2015 at 1:31 PM, Johannes Kulick > > wrote: >> >> Hello, >> >> I wonder if it would be worth to enhance max, min, argmax and argmin > (more?) 
>> with a tie breaking parameter: If multiple entries have the same value > the first >> value is returned by now. It would be useful to have a parameter to > alter this >> behavior to an arbitrary tie-breaking. I would propose, that the > tie-breaking >> function gets a list with all indices of the max/mins. >> >> Example: >> >>> a = np.array([ 1, 2, 5, 5, 2, 1]) >> >>> np.argmax(a, tie_breaking=random.choice) >> 3 >> >> >>> np.argmax(a, tie_breaking=random.choice) >> 2 >> >> >>> np.argmax(a, tie_breaking=random.choice) >> 2 >> >> >>> np.argmax(a, tie_breaking=random.choice) >> 2 >> >> >>> np.argmax(a, tie_breaking=random.choice) >> 3 >> >> Especially for some randomized experiments it is necessary that not > always the >> first maximum is returned, but a random optimum. Thus I end up writing > these >> things over and over again. >> >> I understand, that max and min are crucial functions, which shouldn't > be slowed >> down by the proposed changes. Adding new functions instead of altering the >> existing ones would be a good option. >> >> Are there any concerns against me implementing these things and > sending a pull >> request? Should such a function better be included in scipy for example? > > On the whole, I think I would prefer new functions for this. I assume > you only need variants for argmin() and argmax() and not min() and > max(), since all of the tied values for the latter two would be > identical, so returning the first one is just as good as any other. > is this such a common usecase that its worth a numpy function to replace one liners like this? np.random.choice(np.where(a == a.max())[0]) its also not that inefficient if the number of equal elements is not too large. From charlesr.harris at gmail.com Thu Mar 12 20:02:25 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 12 Mar 2015 18:02:25 -0600 Subject: [Numpy-discussion] Numpy where Message-ID: Hi All, This is apropos gh-5582 dealing with some corner cases of np.where. The following are the current behavior >>> import numpy >>> numpy.where(True) # case 1 ... (array([0]),) >>> numpy.where(True, None, None) # case 2 ... array(None, dtype=object) >>> numpy.ma.where(True) # case 3 ... (array([0]),) >>> numpy.ma.where(True, None, None) # case 4 ... (array([0]),) The question is, what exactly should be done in these cases? I'd be inclined to raise an error for cases 1 and 3. Case two looks correct to me if we agree that scalar inputs are acceptable. Case 4 looks wrong. Thoughts? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Thu Mar 12 21:25:05 2015 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 12 Mar 2015 18:25:05 -0700 Subject: [Numpy-discussion] Numpy where In-Reply-To: References: Message-ID: On Mar 12, 2015 5:02 PM, "Charles R Harris" wrote: > > Hi All, > > This is apropos gh-5582 dealing with some corner cases of np.where. The following are the current behavior > > >>> import numpy > >>> numpy.where(True) # case 1 > ... (array([0]),) > >>> numpy.where(True, None, None) # case 2 > ... array(None, dtype=object) > >>> numpy.ma.where(True) # case 3 > ... (array([0]),) > >>> numpy.ma.where(True, None, None) # case 4 > ... (array([0]),) > > The question is, what exactly should be done in these cases? I'd be inclined to raise an error for cases 1 and 3. Case two looks correct to me if we agree that scalar inputs are acceptable. Case 4 looks wrong. I can't think of any reason scalars wouldn't be acceptable. 
So everything you suggest sounds right to me.

-n
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion at scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From ben.root at ou.edu  Fri Mar 13 00:35:57 2015
From: ben.root at ou.edu (Benjamin Root)
Date: Fri, 13 Mar 2015 00:35:57 -0400
Subject: [Numpy-discussion] Numpy where
In-Reply-To: 
References: 
Message-ID: 

I think the question is if scalars should be acceptable for the first argument, not if it should be for the 2nd and 3rd argument. If a scalar can be given for the first argument, then the first three make sense. Although, I have no clue why we would allow that.

Ben Root
On Mar 12, 2015 9:25 PM, "Nathaniel Smith" wrote:

> On Mar 12, 2015 5:02 PM, "Charles R Harris" wrote:
> >
> > Hi All,
> >
> > This is apropos gh-5582 dealing with some corner cases of np.where. The following are the current behavior
> >
> > >>> import numpy
> > >>> numpy.where(True) # case 1
> > ... (array([0]),)
> > >>> numpy.where(True, None, None) # case 2
> > ... array(None, dtype=object)
> > >>> numpy.ma.where(True) # case 3
> > ... (array([0]),)
> > >>> numpy.ma.where(True, None, None) # case 4
> > ... (array([0]),)
> >
> > The question is, what exactly should be done in these cases? I'd be inclined to raise an error for cases 1 and 3. Case two looks correct to me if we agree that scalar inputs are acceptable. Case 4 looks wrong.
>
> I can't think of any reason scalars wouldn't be acceptable. So everything you suggest sounds right to me.
>
> -n
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
From charlesr.harris at gmail.com  Fri Mar 13 01:16:33 2015
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Thu, 12 Mar 2015 23:16:33 -0600
Subject: [Numpy-discussion] Numpy 1.10
In-Reply-To: References: Message-ID:

On Sun, Mar 8, 2015 at 3:43 PM, Ralf Gommers wrote:

> On Sat, Mar 7, 2015 at 12:40 AM, Charles R Harris <
> charlesr.harris at gmail.com> wrote:
>
>> Hi All,
>>
>> Time to start thinking about numpy 1.10.
>
> Sounds good. Do we have a volunteer for release manager already?

I guess it is my turn, unless someone else wants the experience.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From jaime.frio at gmail.com  Fri Mar 13 02:29:09 2015
From: jaime.frio at gmail.com (=?UTF-8?Q?Jaime_Fern=C3=A1ndez_del_R=C3=ADo?=)
Date: Thu, 12 Mar 2015 23:29:09 -0700
Subject: [Numpy-discussion] Numpy 1.10
In-Reply-To: References: Message-ID:

On Thu, Mar 12, 2015 at 10:16 PM, Charles R Harris <
charlesr.harris at gmail.com> wrote:

> [...]
> I guess it is my turn, unless someone else wants the experience.

What does a release manager do? I will eventually want to be able to tell
my grandchildren that I once managed a numpy release, but am not sure if I
can successfully handle it on my own right now. I will probably need to up
my git foo, which is nothing to write home about...

Maybe for this one I can sign up for release minion, so you have someone
to offload menial tasks?

Jaime

--
(\__/)
( O.o)
( > <) Este es Conejo. Copia a Conejo en tu firma y ayúdale en sus planes
de dominación mundial.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ralf.gommers at gmail.com  Fri Mar 13 02:51:40 2015
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Fri, 13 Mar 2015 07:51:40 +0100
Subject: [Numpy-discussion] Numpy 1.10
In-Reply-To: References: Message-ID:

On Fri, Mar 13, 2015 at 7:29 AM, Jaime Fernández del Río <
jaime.frio at gmail.com> wrote:

> What does a release manager do? I will eventually want to be able to
> tell my grandchildren that I once managed a numpy release, but am not
> sure if I can successfully handle it on my own right now. I will
> probably need to up my git foo, which is nothing to write home about...
>
> Maybe for this one I can sign up for release minion, so you have someone
> to offload menial tasks?

I have no doubt that you can do this job well right now - you are vastly
more experienced than I was when I picked up that role. It's not rocket
science.
I have to run now, but here's a start to give you an idea of what it
entails (some details and version numbers may be slightly outdated):
https://github.com/numpy/numpy/blob/master/doc/HOWTO_RELEASE.rst.txt

Cheers,
Ralf

> Jaime
>
> --
> (\__/)
> ( O.o)
> ( > <) Este es Conejo. Copia a Conejo en tu firma y ayúdale en sus
> planes de dominación mundial.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From dgasmith at icloud.com  Fri Mar 13 10:22:14 2015
From: dgasmith at icloud.com (Daniel Smith)
Date: Fri, 13 Mar 2015 09:22:14 -0500
Subject: [Numpy-discussion] Custom __array_interface__ error
Message-ID: <66A012E1-B993-45CA-AAAF-8B60D1ED6305@icloud.com>

Greetings everyone,
I have a new project that deals with core and disk tensors wrapped into a
single object so that the expressions are transparent to the user after
the tensor is formed. I would like to add __array_interface__ to the core
tensor and provide a reasonable error message if someone tries to call the
__array_interface__ for a disk tensor. I may be missing something, but I
do not see an obvious way to do this in the python layer.

Currently I do something like:

    if ttype == "Core":
        self.__array_interface__ = self.tensor.ndarray_interface()
    else:
        self.__array_interface__ = {'typestr':
            'Only Core tensor types are supported.'}

Which provides at least a readable error message if it is not a core
tensor:
TypeError: data type "Only Core tensor types are supported." not understood

An easy solution I see is to change the numpy C-side __array_interface__
error message to throw custom strings.

In numpy/core/src/multiarray/ctors.c:2100 we have the __array_interface__
conversion:

    if (!PyDict_Check(iface)) {
        Py_DECREF(iface);
        PyErr_SetString(PyExc_ValueError,
                "Invalid __array_interface__ value, must be a dict");
        return NULL;
    }

It could simply be changed to:

    if (!PyDict_Check(iface)) {
        if (PyString_Check(iface)) {
            /* pass the custom message through (PyErr_SetString needs the
               char*, not the PyObject*) */
            PyErr_SetString(PyExc_ValueError, PyString_AsString(iface));
        }
        else {
            PyErr_SetString(PyExc_ValueError,
                    "Invalid __array_interface__ value, must be a dict");
        }
        Py_DECREF(iface);
        return NULL;
    }

Thoughts?

Cheers,
-Daniel Smith

From ben.root at ou.edu  Fri Mar 13 10:29:38 2015
From: ben.root at ou.edu (Benjamin Root)
Date: Fri, 13 Mar 2015 10:29:38 -0400
Subject: [Numpy-discussion] Numpy 1.10
In-Reply-To: References: Message-ID:

Release minion? Sounds a lot like an academic minion:
https://twitter.com/academicminions

On Fri, Mar 13, 2015 at 2:51 AM, Ralf Gommers wrote:

> On Fri, Mar 13, 2015 at 7:29 AM, Jaime Fernández del Río <
> jaime.frio at gmail.com> wrote:
>
>> What does a release manager do? I will eventually want to be able to
>> tell my grandchildren that I once managed a numpy release, but am not
>> sure if I can successfully handle it on my own right now. I will
>> probably need to up my git foo, which is nothing to write home about...
>> Maybe for this one I can sign up for release minion, so you have
>> someone to offload menial tasks?
>
> I have no doubt that you can do this job well right now - you are vastly
> more experienced than I was when I picked up that role. It's not rocket
> science.
>
> I have to run now, but here's a start to give you an idea of what it
> entails (some details and version numbers may be slightly outdated):
> https://github.com/numpy/numpy/blob/master/doc/HOWTO_RELEASE.rst.txt
>
> Cheers,
> Ralf
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From jeffreback at gmail.com  Fri Mar 13 11:33:36 2015
From: jeffreback at gmail.com (Jeff Reback)
Date: Fri, 13 Mar 2015 11:33:36 -0400
Subject: [Numpy-discussion] Pandas v0.16.0 release candidate 1
Message-ID:

Hi,

I'm pleased to announce the availability of the first release candidate of
Pandas 0.16.0. Please try this RC and report any issues here: Pandas Issues

We will be releasing officially in 1 week or so.

This is a major release from 0.15.2 and includes a small number of API
changes, several new features, enhancements, and performance improvements
along with a large number of bug fixes. We recommend that all users
upgrade to this version.

Highlights include:

- DataFrame.assign method, see *here*
- Series.to_coo/from_coo methods to interact with scipy.sparse, see *here*
- Backwards incompatible change to Timedelta to conform the .seconds
  attribute with datetime.timedelta, see *here*
- Changes to the .loc slicing API to conform with the behavior of .ix,
  see *here*
- Changes to the default for ordering in the Categorical constructor,
  see *here*

Here are the full whatsnew and documentation links: v0.16.0 Whatsnew

Source tarballs, windows builds, and mac wheels are available here:
Pandas v0.16.0rc1 Release

A big thank you to everyone who contributed to this release!

Jeff
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From alan.isaac at gmail.com  Fri Mar 13 11:57:00 2015
From: alan.isaac at gmail.com (Alan G Isaac)
Date: Fri, 13 Mar 2015 11:57:00 -0400
Subject: [Numpy-discussion] argument handling by uniform
Message-ID: <550308CC.9080105@gmail.com>

Today I accidentally wrote `uni = np.random.uniform((-0.5,0.5),201)`,
supplying a tuple instead of separate low and high values. This gave
me two draws (from [0..201] I think). My question: how were the
arguments interpreted?

Thanks,
Alan Isaac

From sebastian at sipsolutions.net  Fri Mar 13 12:01:15 2015
From: sebastian at sipsolutions.net (Sebastian Berg)
Date: Fri, 13 Mar 2015 17:01:15 +0100
Subject: [Numpy-discussion] argument handling by uniform
In-Reply-To: <550308CC.9080105@gmail.com>
References: <550308CC.9080105@gmail.com>
Message-ID: <1426262475.10094.0.camel@sipsolutions.net>

On Fr, 2015-03-13 at 11:57 -0400, Alan G Isaac wrote:
> Today I accidentally wrote `uni = np.random.uniform((-0.5,0.5),201)`,
> supplying a tuple instead of separate low and high values. This gave
> me two draws (from [0..201] I think).
> My question: how were the arguments interpreted?

I think all of numpy.random broadcasts all of these arguments. So if you
give two bounds, then you get one draw from each; you could even do a grid
of parameter combinations.

- Sebastian
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 819 bytes
Desc: This is a digitally signed message part
URL:

From robert.kern at gmail.com  Fri Mar 13 12:01:33 2015
From: robert.kern at gmail.com (Robert Kern)
Date: Fri, 13 Mar 2015 16:01:33 +0000
Subject: [Numpy-discussion] argument handling by uniform
In-Reply-To: <550308CC.9080105@gmail.com>
References: <550308CC.9080105@gmail.com>
Message-ID:

On Fri, Mar 13, 2015 at 3:57 PM, Alan G Isaac wrote:
> Today I accidentally wrote `uni = np.random.uniform((-0.5,0.5),201)`,
> [...]

Broadcast against each other. Roughly equivalent to:

    uni = np.array([
        np.random.uniform(-0.5, 201),
        np.random.uniform(0.5, 201),
    ])

-- Robert Kern
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ewm at redtetrahedron.org  Fri Mar 13 12:02:32 2015
From: ewm at redtetrahedron.org (Eric Moore)
Date: Fri, 13 Mar 2015 12:02:32 -0400
Subject: [Numpy-discussion] argument handling by uniform
In-Reply-To: <550308CC.9080105@gmail.com>
References: <550308CC.9080105@gmail.com>
Message-ID:

`low` and `high` can be arrays, so you received one draw from (-0.5, 201)
and one draw from (0.5, 201).

Eric

On Fri, Mar 13, 2015 at 11:57 AM, Alan G Isaac wrote:
> [...]
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
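[Editorial aside: a sketch of the parameter broadcasting Sebastian
mentions -- e.g. a 2x2 grid of (low, high) combinations. Illustrative
values, not from the thread:]

    import numpy as np

    low = np.array([[-0.5], [0.5]])   # column vector: two lows
    high = np.array([100.0, 201.0])   # row vector: two highs
    draws = np.random.uniform(low, high)
    print(draws.shape)  # (2, 2): one draw per (low, high) pair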
From jakirkham at gmail.com  Fri Mar 13 12:29:18 2015
From: jakirkham at gmail.com (John Kirkham)
Date: Fri, 13 Mar 2015 12:29:18 -0400
Subject: [Numpy-discussion] Numpy where
Message-ID:

Hey Everyone,

I felt like I should add to the mix. I added the issue
( https://github.com/numpy/numpy/issues/5679 ) to tie these options
together. My main concern is that both wheres behave the same.

As far as using a scalar as the first argument, it was an easy example. We
could have used actual arrays and we would have had arrays full of Nones.
Though I do see the point of "why should we have indices returned for a
scalar". This means nonzero should change, as well.

I agree with Chuck on case 4. If we already allow all other scalars, None
should be no different. Plus, I would rather have a breaking change in the
ma module than in core.

Best,
John

> I think the question is whether scalars should be acceptable for the
> first argument, not whether they should be for the 2nd and 3rd
> arguments.
>
> If a scalar can be given for the first argument, then the first three
> make sense. Although, I have no clue why we would allow that.
>
> Ben Root
> [...]
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From shoyer at gmail.com  Fri Mar 13 12:57:46 2015
From: shoyer at gmail.com (Stephan Hoyer)
Date: Fri, 13 Mar 2015 09:57:46 -0700
Subject: [Numpy-discussion] Custom __array_interface__ error
In-Reply-To: <66A012E1-B993-45CA-AAAF-8B60D1ED6305@icloud.com>
References: <66A012E1-B993-45CA-AAAF-8B60D1ED6305@icloud.com>
Message-ID:

In my experience writing ndarray-like objects, you likely want to
implement __array__ instead of __array_interface__. The former gives you
full control to create the ndarray yourself.

On Fri, Mar 13, 2015 at 7:22 AM, Daniel Smith wrote:
> Greetings everyone,
> I have a new project that deals with core and disk tensors wrapped into
> a single object so that the expressions are transparent to the user
> after the tensor is formed. I would like to add __array_interface__ to
> the core tensor and provide a reasonable error message if someone tries
> to call the __array_interface__ for a disk tensor. I may be missing
> something, but I do not see an obvious way to do this in the python
> layer.
> [...]
> Which provides at least a readable error message if it is not a core
> tensor:
> TypeError: data type "Only Core tensor types are supported."
> not understood
>
> An easy solution I see is to change the numpy C-side __array_interface__
> error message to throw custom strings.
> [...]
>
> Thoughts?
>
> Cheers,
> -Daniel Smith
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From alan.isaac at gmail.com  Fri Mar 13 13:17:39 2015
From: alan.isaac at gmail.com (Alan G Isaac)
Date: Fri, 13 Mar 2015 13:17:39 -0400
Subject: [Numpy-discussion] argument handling by uniform
In-Reply-To: References: <550308CC.9080105@gmail.com>
Message-ID: <55031BB3.2@gmail.com>

On 3/13/2015 12:01 PM, Robert Kern wrote:
> Roughly equivalent to:
>
>     uni = np.array([
>         np.random.uniform(-0.5, 201),
>         np.random.uniform(0.5, 201),
>     ])

OK, broadcasting of `low` and `high` is reasonably fun. But is it
documented? I was looking at the docstring, which matches the online help:
http://docs.scipy.org/doc/numpy/reference/generated/numpy.random.uniform.html#numpy.random.uniform

Thanks,
Alan Isaac

From ndbecker2 at gmail.com  Fri Mar 13 13:34:06 2015
From: ndbecker2 at gmail.com (Neal Becker)
Date: Fri, 13 Mar 2015 13:34:06 -0400
Subject: [Numpy-discussion] random.RandomState and deepcopy
Message-ID:

It is common that, to guarantee good statistical independence between
various random generators, a singleton instance of an RNG is shared
between them.

So I typically have various random generator objects, which (sometimes
several levels of objects deep) embed an instance of RandomState.

Now I have a requirement to copy a generator object (without knowing
exactly what that generator object is). My solution is to use deepcopy on
the top-level object. But I need to overload __deepcopy__ on the singleton
RandomState object.

Unfortunately, RandomState doesn't allow customization of __deepcopy__ (or
anything else). And it has no __dict__. My solution is:

    class shared_random_state (object):
        def __init__ (self, rs):
            self.rs = rs

        def __getattr__ (self, attr):
            return getattr (self.rs, attr)

        def __deepcopy__ (self, memo):
            return self

An example usage:

    rs = shared_random_state (RandomState(0))

    from exponential import exponential
    e = exponential (rs, 1)

where exponential is:

    class exponential (object):
        def __init__ (self, rs, mu):
            self.rs = rs
            self.mu = mu

        def __call__ (self, size=None):
            if size is None:
                return self.rs.exponential (self.mu, 1)[0]
            else:
                return self.rs.exponential (self.mu, size)

        def __repr__ (self):
            return 'exp(%s)' % self.mu

I wonder if anyone has any other suggestions? Personally, I would prefer
if numpy provided a more direct solution to this. Either by providing for
overloading RandomState deepcopy, or by making the copy behavior
switchable with a flag to the constructor.

--
Those who fail to understand recursion are doomed to repeat it
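[Editorial aside: a quick check of the pattern above, reusing Neal's two
classes. Hedged: the memo argument on __deepcopy__ above is required for
copy.deepcopy to call it, and has been added editorially.]

    import copy
    from numpy.random import RandomState

    rs = shared_random_state(RandomState(0))
    e = exponential(rs, 1)
    e2 = copy.deepcopy(e)   # copies the wrapper object...
    assert e2.rs is e.rs    # ...but the RNG singleton is shared
    print(e(), e2())        # both draw from the same underlying stream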
From pmhobson at gmail.com  Fri Mar 13 13:45:27 2015
From: pmhobson at gmail.com (Paul Hobson)
Date: Fri, 13 Mar 2015 10:45:27 -0700
Subject: [Numpy-discussion] [pydata] Pandas v0.16.0 release candidate 1
In-Reply-To: References: Message-ID:

Thanks for all the hard work! Really looking forward to using the `assign`
method in long chained statements.
-Paul

On Fri, Mar 13, 2015 at 8:33 AM, Jeff Reback wrote:
> Hi,
>
> I'm pleased to announce the availability of the first release candidate
> of Pandas 0.16.0.
> [...]
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From robert.kern at gmail.com  Fri Mar 13 13:45:03 2015
From: robert.kern at gmail.com (Robert Kern)
Date: Fri, 13 Mar 2015 17:45:03 +0000
Subject: [Numpy-discussion] random.RandomState and deepcopy
In-Reply-To: References: Message-ID:

On Fri, Mar 13, 2015 at 5:34 PM, Neal Becker wrote:
>
> It is common that, to guarantee good statistical independence between
> various random generators, a singleton instance of an RNG is shared
> between them.
>
> So I typically have various random generator objects, which (sometimes
> several levels of objects deep) embed an instance of RandomState.
>
> Now I have a requirement to copy a generator object (without knowing
> exactly what that generator object is).

Or rather, you want the generator object to *avoid* copies by returning
itself when a copy is requested of it.

> My solution is to use deepcopy on the top-level object. But I need to
> overload __deepcopy__ on the singleton RandomState object.
>
> Unfortunately, RandomState doesn't allow customization of __deepcopy__
> (or anything else). And it has no __dict__.

You can always subclass RandomState to override its __deepcopy__.

-- Robert Kern
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ndbecker2 at gmail.com  Fri Mar 13 13:59:56 2015
From: ndbecker2 at gmail.com (Neal Becker)
Date: Fri, 13 Mar 2015 13:59:56 -0400
Subject: [Numpy-discussion] random.RandomState and deepcopy
References: Message-ID:

Robert Kern wrote:
> [...]
> You can always subclass RandomState to override its __deepcopy__.

Yes, I think I prefer this:

    from numpy.random import RandomState

    class shared_random_state (RandomState):
        def __init__ (self, rs):
            RandomState.__init__(self, rs)

        def __deepcopy__ (self, memo):
            return self

Although, that means I have to use it like this:

    rs = shared_random_state (0)

where I really would prefer (for aesthetic reasons):

    rs = shared_random_state (RandomState(0))

but I don't know how to do that if shared_random_state inherits from
RandomState.

--
Those who fail to understand recursion are doomed to repeat it

From robert.kern at gmail.com  Fri Mar 13 14:05:11 2015
From: robert.kern at gmail.com (Robert Kern)
Date: Fri, 13 Mar 2015 18:05:11 +0000
Subject: [Numpy-discussion] random.RandomState and deepcopy
In-Reply-To: References: Message-ID:

On Fri, Mar 13, 2015 at 5:59 PM, Neal Becker wrote:
> [...]
> where I really would prefer (for aesthetic reasons):
>
>     rs = shared_random_state (RandomState(0))
>
> but I don't know how to do that if shared_random_state inherits from
> RandomState.
If you insist:

    class shared_random_state(RandomState):
        def __init__(self, rs):
            self.__setstate__(rs.__getstate__())

        def __deepcopy__(self, memo):
            return self

-- Robert Kern
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From njs at pobox.com  Fri Mar 13 14:43:36 2015
From: njs at pobox.com (Nathaniel Smith)
Date: Fri, 13 Mar 2015 11:43:36 -0700
Subject: [Numpy-discussion] Custom __array_interface__ error
In-Reply-To: <66A012E1-B993-45CA-AAAF-8B60D1ED6305@icloud.com>
References: <66A012E1-B993-45CA-AAAF-8B60D1ED6305@icloud.com>
Message-ID:

On Mar 13, 2015 7:22 AM, "Daniel Smith" wrote:
>
> Greetings everyone,
> I have a new project that deals with core and disk tensors wrapped into
> a single object so that the expressions are transparent to the user
> after the tensor is formed. I would like to add __array_interface__ to
> the core tensor and provide a reasonable error message if someone tries
> to call the __array_interface__ for a disk tensor. I may be missing
> something, but I do not see an obvious way to do this in the python
> layer.

Just define your class so that attempting to access __array_interface__
raises an error directly:

    class DiskTensor(object):
        @property
        def __array_interface__(self):
            raise TypeError(...)

-n
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From njs at pobox.com  Fri Mar 13 15:26:21 2015
From: njs at pobox.com (Nathaniel Smith)
Date: Fri, 13 Mar 2015 12:26:21 -0700
Subject: [Numpy-discussion] Numpy where
In-Reply-To: References: Message-ID:

On Thu, Mar 12, 2015 at 9:35 PM, Benjamin Root wrote:
> I think the question is whether scalars should be acceptable for the
> first argument, not whether they should be for the 2nd and 3rd
> arguments.
>
> If a scalar can be given for the first argument, then the first three
> make sense. Although, I have no clue why we would allow that.

Why wouldn't we? The where function takes three arguments which are
broadcast against each other, so disallowing scalars would require
adding a special case.

-n

--
Nathaniel J. Smith -- http://vorpus.org

From njs at pobox.com  Fri Mar 13 15:31:17 2015
From: njs at pobox.com (Nathaniel Smith)
Date: Fri, 13 Mar 2015 12:31:17 -0700
Subject: [Numpy-discussion] Numpy where
In-Reply-To: References: Message-ID:

On Thu, Mar 12, 2015 at 5:02 PM, Charles R Harris wrote:
> Hi All,
>
> This is apropos gh-5582 dealing with some corner cases of np.where.
> [...]
> The question is, what exactly should be done in these cases? I'd be
> inclined to raise an error for cases 1 and 3.

Actually, I forgot about the annoying thing where np.where is
two-functions-for-the-price-of-one, and np.where(x) is equivalent to
np.nonzero(asarray(x)). That's documented, and is what's happening in
cases 1 and 3.

So I take back my previous email: I'm -1 on specifically making
np.where(scalar) into an error -- we should either consistently stick
with the current nonzero() behavior, or consistently remove it. I'd be
happy with deprecating np.where(x) entirely and eventually making it an
error, since it's just a weird ugly alias for nonzero().

(Case 2 still looks correct to me, and case 4 just looks like an
accidental side-effect of someone being lazy and using None to mean "not
specified".)

-n

--
Nathaniel J. Smith -- http://vorpus.org
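[Editorial aside: the one-argument equivalence Nathaniel describes, as a
two-line check (illustrative):]

    import numpy as np

    cond = np.array([[0, 1], [2, 0]])
    print(np.where(cond))                # one-argument form...
    print(np.nonzero(np.asarray(cond)))  # ...is exactly nonzero()
    # both print (array([0, 1]), array([1, 0]))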
From charlesr.harris at gmail.com  Fri Mar 13 16:09:48 2015
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Fri, 13 Mar 2015 14:09:48 -0600
Subject: [Numpy-discussion] Numpy where
In-Reply-To: References: Message-ID:

On Fri, Mar 13, 2015 at 1:26 PM, Nathaniel Smith wrote:
> Why wouldn't we? The where function takes three arguments which are
> broadcast against each other, so disallowing scalars would require
> adding a special case.

I'm coming to the conclusion that only #4 is incorrect. The process seems
to go: cast scalars to 1-D arrays (hence #1 and #3), and indexing results
in #2.

The oddity is that the second and third arguments are optional, and the
action of the function depends on that. I would have made those arguments
required, as omitting them gives the same as a call to nonzero. But things
are as they are...

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From charlesr.harris at gmail.com  Fri Mar 13 21:07:17 2015
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Fri, 13 Mar 2015 19:07:17 -0600
Subject: [Numpy-discussion] Numpy where
In-Reply-To: References: Message-ID:

On Fri, Mar 13, 2015 at 2:09 PM, Charles R Harris wrote:
> [...]

To summarize, np.where is OK as is, np.ma.where needs fixing. As for
deprecating the use of np.where as np.nonzero, that looks very easy to do
in the code. However, the PyArray_Where function is in the numpy C-API, so
we might want to be careful about doing that.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From danny_52 at netvision.net.il  Sat Mar 14 05:57:23 2015
From: danny_52 at netvision.net.il (Danny Kramer)
Date: Sat, 14 Mar 2015 09:57:23 +0000 (UTC)
Subject: [Numpy-discussion] Error message
Message-ID:

Hi,
I am getting the following error message:

    C:\Python27\lib\site-packages\numpy\lib\npyio.py:819: UserWarning:
    loadtxt: Empty input file: "[]"
      warnings.warn('loadtxt: Empty input file: "%s"' % fname)
    main loop list assignment index out of range

1. Why is it happening?
2. How can I fix it?

Thanks a lot in advance.
Danny

From klemm at phys.ethz.ch  Sat Mar 14 06:20:30 2015
From: klemm at phys.ethz.ch (Hanno Klemm)
Date: Sat, 14 Mar 2015 11:20:30 +0100
Subject: [Numpy-discussion] Error message
In-Reply-To: References: Message-ID:

> On 14.03.2015, at 10:57, Danny Kramer wrote:
> Hi,
> I am getting the following error message:
> [...]
> 1. Why is it happening?
> 2. How can I fix it?

Not seeing the code that produced this warning, we are left guessing. My
guess would be that the warning already tells you what's going on: you
gave an empty list as a file name to loadtxt, and loadtxt can't find a
file of that name to load.

Hanno

From jakirkham at gmail.com  Sat Mar 14 19:02:51 2015
From: jakirkham at gmail.com (John Kirkham)
Date: Sat, 14 Mar 2015 19:02:51 -0400
Subject: [Numpy-discussion] Fix masked arrays to properly edit views
Message-ID: <63F3D076-1AB6-4CAE-8FCF-653883F1E34F@gmail.com>

The sample case of the issue ( https://github.com/numpy/numpy/issues/5558 )
is shown below. A proposal to address this behavior can be found here
( https://github.com/numpy/numpy/pull/5580 ). Please give me your feedback.

I tried to change the mask of `a` through a subindexed view, but was
unable.
Using this setup I can reproduce this in the 1.9.1 version of NumPy.

    import numpy as np

    a = np.arange(6).reshape(2,3)
    a = np.ma.masked_array(a, mask=np.ma.getmaskarray(a), shrink=False)

    b = a[1:2,1:2]

    c = np.zeros(b.shape, b.dtype)
    c = np.ma.masked_array(c, mask=np.ma.getmaskarray(c), shrink=False)
    c[:] = np.ma.masked

This yields what one would expect for `a`, `b`, and `c` (seen below).

    masked_array(data =
     [[0 1 2]
      [3 4 5]],
                 mask =
     [[False False False]
      [False False False]],
           fill_value = 999999)

    masked_array(data =
     [[4]],
                 mask =
     [[False]],
           fill_value = 999999)

    masked_array(data =
     [[--]],
                 mask =
     [[ True]],
           fill_value = 999999)

Now, it would seem reasonable that to copy data into `b` from `c` one can
use `__setitem__` (seen below).

    b[:] = c

This results in new data and mask for `b`.

    masked_array(data =
     [[--]],
                 mask =
     [[ True]],
           fill_value = 999999)

This should, in turn, change `a`. However, the mask of `a` remains
unchanged (seen below).

    masked_array(data =
     [[0 1 2]
      [3 0 5]],
                 mask =
     [[False False False]
      [False False False]],
           fill_value = 999999)

Best,
John
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From efiring at hawaii.edu  Sat Mar 14 20:01:04 2015
From: efiring at hawaii.edu (Eric Firing)
Date: Sat, 14 Mar 2015 14:01:04 -1000
Subject: [Numpy-discussion] Fix masked arrays to properly edit views
In-Reply-To: <63F3D076-1AB6-4CAE-8FCF-653883F1E34F@gmail.com>
References: <63F3D076-1AB6-4CAE-8FCF-653883F1E34F@gmail.com>
Message-ID: <5504CBC0.1080502@hawaii.edu>

On 2015/03/14 1:02 PM, John Kirkham wrote:
> The sample case of the issue (
> https://github.com/numpy/numpy/issues/5558 ) is shown below. A proposal
> to address this behavior can be found here (
> https://github.com/numpy/numpy/pull/5580 ). Please give me your
> feedback.
>
> I tried to change the mask of `a` through a subindexed view, but was
> unable.
> [...]

I agree that this behavior is wrong. A related oddity is this:

    In [24]: a = np.arange(6).reshape(2,3)
    In [25]: a = np.ma.array(a, mask=np.ma.getmaskarray(a), shrink=False)
    In [27]: a.sharedmask
    True
    In [28]: a.unshare_mask()
    In [30]: b = a[1:2, 1:2]
    In [31]: b[:] = np.ma.masked
    In [32]: b.sharedmask
    False
    In [33]: a
    masked_array(data =
     [[0 1 2]
      [3 -- 5]],
                 mask =
     [[False False False]
      [False  True False]],
           fill_value = 999999)

It looks like the sharedmask property simply is not being set and
interpreted correctly--a freshly initialized array has sharedmask True;
and after setting it to False, changing the mask of a new view *does*
change the mask in the original.

Eric
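[Editorial aside: a minimal workaround sketch distilled from Eric's
observation. Hedged: it covers the direct np.ma.masked assignment he
shows; whether it also fixes John's `b[:] = c` case is not verified here.]

    import numpy as np

    a = np.ma.array(np.arange(6).reshape(2, 3),
                    mask=np.zeros((2, 3), dtype=bool))
    a.unshare_mask()        # give `a` sole ownership of its mask
    b = a[1:2, 1:2]
    b[:] = np.ma.masked     # now propagates to the parent array
    print(a.mask[1, 1])     # True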
From rmcgibbo at gmail.com  Mon Mar 16 00:32:49 2015
From: rmcgibbo at gmail.com (Robert McGibbon)
Date: Sun, 15 Mar 2015 21:32:49 -0700
Subject: [Numpy-discussion] Rewrite np.histogram in c?
Message-ID:

Hi,

Numpy.histogram is implemented in python, and is a little sluggish. This
has been discussed previously on the mailing list, [1, 2]. It came up in a
project that I maintain, where a new feature is bottlenecked by
numpy.histogram, and one developer suggested a faster implementation in
cython [3].

Would it make sense to reimplement this function in c? or cython? Is
moving functions like this from python to c to improve performance within
the scope of the development roadmap for numpy? I started implementing
this a little bit in c, [4] but I figured I should check in here first.

-Robert

[1] http://scipy-user.10969.n7.nabble.com/numpy-histogram-is-slow-td17208.html
[2] http://numpy-discussion.10968.n7.nabble.com/Fast-histogram-td9359.html
[3] https://github.com/mdtraj/mdtraj/pull/734
[4] https://github.com/rmcgibbo/numpy/tree/histogram
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From shoyer at gmail.com  Mon Mar 16 01:12:40 2015
From: shoyer at gmail.com (Stephan Hoyer)
Date: Sun, 15 Mar 2015 22:12:40 -0700
Subject: [Numpy-discussion] numpy.stack -- which function, if any,
 deserves the name?
Message-ID:

In the past months there have been two proposals for new numpy functions
using the name "stack":

1. np.stack for stacking like np.asarray(np.bmat(...))
http://thread.gmane.org/gmane.comp.python.numeric.general/58748/
https://github.com/numpy/numpy/pull/5057

2. np.stack for stacking along an arbitrary new axis (this was my proposal)
http://thread.gmane.org/gmane.comp.python.numeric.general/59850/
https://github.com/numpy/numpy/pull/5605

Both functions generalize the notion of stacking arrays from the existing
hstack, vstack and dstack, but in two very different ways. Both could be
useful -- but we can only call one "stack". Which one deserves that name?

The existing *stack functions use the word "stack" to refer to combining
arrays in two similarly different ways:

a. For ND -> ND stacking along an existing dimension (like
numpy.concatenate and proposal 1)
b. For ND -> (N+1)D stacking along new dimensions (like proposal 2).

I think it would be much cleaner API design if we had different words to
denote these two different operations. Concatenate for "combine along an
existing dimension" already exists, so my thought (when I wrote proposal
2) was that the verb "stack" could be reserved (going forward) for
"combine along a new dimension." This also has the advantage of suggesting
that "concatenate" and "stack" are the two fundamental operations for
combining N-dimensional arrays. The documentation on this is currently
quite confusing, mostly because no function like that in proposal 2
currently exists.

Of course, the *stack functions have existed for quite some time, and in
many cases vstack and hstack are indeed used for concatenate-like
functionality (e.g., whenever they are used for 2D arrays/matrices). So
the case is not entirely clear-cut. (We'll never be able to remove this
functionality from NumPy.)

In any case, I would appreciate your thoughts.

Best,
Stephan
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
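[Editorial aside: a sketch of proposal 2's semantics in terms of existing
primitives -- join along a NEW axis by expanding each input first. The
helper name is illustrative:]

    import numpy as np

    def stack_new_axis(arrays, axis=0):
        # insert a new length-1 axis in each array, then concatenate there
        expanded = [np.expand_dims(a, axis) for a in arrays]
        return np.concatenate(expanded, axis=axis)

    a, b = np.zeros((2, 3)), np.ones((2, 3))
    print(stack_new_axis([a, b], axis=1).shape)  # (2, 2, 3)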
From jaime.frio at gmail.com  Mon Mar 16 02:00:33 2015
From: jaime.frio at gmail.com (=?UTF-8?Q?Jaime_Fern=C3=A1ndez_del_R=C3=ADo?=)
Date: Sun, 15 Mar 2015 23:00:33 -0700
Subject: [Numpy-discussion] Rewrite np.histogram in c?
In-Reply-To: References: Message-ID:

On Sun, Mar 15, 2015 at 9:32 PM, Robert McGibbon wrote:
> Hi,
>
> Numpy.histogram is implemented in python, and is a little sluggish. This
> has been discussed previously on the mailing list, [1, 2].
> [...]

Where do you think the performance gains will come from? The PR in your
project that claims a 10x speed-up uses a method that is only fit for
equally spaced bins. I want to think that implementing that exact same
algorithm in Python with NumPy would be comparably fast, say within 2x.

For the general case, NumPy is already doing most of the heavy lifting
(the sorting and the searching) in C: simply replicating the same
algorithmic approach entirely in C is unlikely to provide any major
speed-up. And if the change is to the algorithm, then we should first try
it out in Python.

That said, if you can speed things up 10x, I don't think there is going to
be much opposition to moving it to C!

Jaime

--
(\__/)
( O.o)
( > <) Este es Conejo. Copia a Conejo en tu firma y ayúdale en sus planes
de dominación mundial.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From rmcgibbo at gmail.com  Mon Mar 16 02:06:43 2015
From: rmcgibbo at gmail.com (Robert McGibbon)
Date: Sun, 15 Mar 2015 23:06:43 -0700
Subject: [Numpy-discussion] Rewrite np.histogram in c?
In-Reply-To: References: Message-ID:

It might make sense to dispatch to difference c implements if the bins are
equally spaced (as created by using an integer for the np.histogram bins
argument), vs. non-equally-spaced bins.

In that case, getting the bigger speedup may be easier, at least for one
common use case.

-Robert

On Sun, Mar 15, 2015 at 11:00 PM, Jaime Fernández del Río <
jaime.frio at gmail.com> wrote:

> On Sun, Mar 15, 2015 at 9:32 PM, Robert McGibbon wrote:
>> Numpy.histogram is implemented in python, and is a little sluggish. This
>> has been discussed previously on the mailing list, [1, 2].
>> [...]
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From rmcgibbo at gmail.com  Mon Mar 16 02:19:59 2015
From: rmcgibbo at gmail.com (Robert McGibbon)
Date: Sun, 15 Mar 2015 23:19:59 -0700
Subject: [Numpy-discussion] Rewrite np.histogram in c?
In-Reply-To: References: Message-ID:

My apologies for the typo: 'implements' -> 'implementations'

-Robert

On Sun, Mar 15, 2015 at 11:06 PM, Robert McGibbon wrote:
> It might make sense to dispatch to difference c implements if the bins
> are equally spaced (as created by using an integer for the np.histogram
> bins argument), vs. non-equally-spaced bins.
> [...]
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From stefan.otte at gmail.com  Mon Mar 16 04:50:34 2015
From: stefan.otte at gmail.com (Stefan Otte)
Date: Mon, 16 Mar 2015 09:50:34 +0100
Subject: [Numpy-discussion] numpy.stack -- which function, if any,
 deserves the name?
In-Reply-To: References: Message-ID:

Hey,

> 1. np.stack for stacking like np.asarray(np.bmat(...))
> http://thread.gmane.org/gmane.comp.python.numeric.general/58748/
> https://github.com/numpy/numpy/pull/5057

I'm the author of this proposal. I'll just give some context real quickly.

"My stack" started really simple, basically allowing a Matlab-like
notation for stacking:

    matlab: [ a b; c d ]
    numpy:  stack([[a, b], [c, d]]) or even stack([a, b], [c, d])

where a, b, c, and d are arrays. (A sketch of this notation built on
existing primitives follows below.)

During the discussion people asked for fancier stacking and auto-filling
of blocks that are not explicitly set (think of an "eye" matrix where
only certain blocks are set).

Alternatively, we thought of refactoring the core of bmat [2] so that it
can be used with arrays and matrices. This would allow stack("a b; c d")
where a, b, c, and d are the names of arrays/matrices. (Also bmat would
get better documentation during the refactoring :)).

Summarizing, my proposal is mostly concerned with how to create block
arrays from given arrays. I don't care about the name "stack". I just
used "stack" because it replaced hstack/vstack for me. Maybe "bstack" for
block stack, or "barray" for block array?

I have the feeling [1] that my use case is more common, but I like the
second proposal.

Cheers,
Stefan

[1] Everybody generalizes from oneself. At least I do.
[2] http://docs.scipy.org/doc/numpy/reference/generated/numpy.bmat.html
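[Editorial aside: the promised sketch of the Matlab-like block notation,
using only existing NumPy primitives. The helper name block_stack is
illustrative, not an actual NumPy API:]

    import numpy as np

    def block_stack(rows):
        # rows is a list of lists of arrays: [[a, b], [c, d]] ~ [a b; c d]
        return np.vstack([np.hstack(row) for row in rows])

    a, b = np.zeros((2, 2)), np.ones((2, 2))
    m = block_stack([[a, b], [b, a]])  # like Matlab's [a b; b a]
    print(m.shape)  # (4, 4)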
From njs at pobox.com  Mon Mar 16 04:57:52 2015
From: njs at pobox.com (Nathaniel Smith)
Date: Mon, 16 Mar 2015 01:57:52 -0700
Subject: [Numpy-discussion] numpy.stack -- which function, if any,
 deserves the name?
In-Reply-To: References: Message-ID:

We already use the word "stack" in lots of function names to refer to
something different from what bmat does. So while I definitely agree we
should have something like bmat for ndarrays, it would be better all the
same to just pick a different name. np.block, even, might do the job.

On Mar 16, 2015 1:50 AM, "Stefan Otte" wrote:
> Hey,
>
> I'm the author of this proposal. I'll just give some context real
> quickly.
> [...]
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From jaime.frio at gmail.com  Mon Mar 16 09:56:58 2015
From: jaime.frio at gmail.com (=?UTF-8?Q?Jaime_Fern=C3=A1ndez_del_R=C3=ADo?=)
Date: Mon, 16 Mar 2015 06:56:58 -0700
Subject: [Numpy-discussion] Rewrite np.histogram in c?
In-Reply-To: References: Message-ID:

On Sun, Mar 15, 2015 at 11:06 PM, Robert McGibbon wrote:
> It might make sense to dispatch to difference c implements if the bins
> are equally spaced (as created by using an integer for the np.histogram
> bins argument), vs. non-equally-spaced bins.

Dispatching to a different method seems like a no-brainer indeed. The
question is whether we really need to do this in C. Maybe for some very
specific case or cases it makes sense to have a super fast C path, e.g.
no weights and bins is an integer. Even then, rather than rewriting the
whole thing in C, it may be a better idea to leave the parsing of the
inputs in Python, and have a C helper function wrapped and privately
exposed, similarly to how `np.core.multiarray.interp` is used by
`np.interp`. But I would still first give it a try in Python...

Jaime

--
(\__/)
( O.o)
( > <) Este es Conejo. Copia a Conejo en tu firma y ayúdale en sus planes
de dominación mundial.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
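[Editorial aside: a pure-Python sketch of the equal-width-bins fast path
being discussed -- compute bin indices arithmetically and count with
np.bincount instead of searching sorted edges. Hedged: illustrative only,
not the np.histogram implementation.]

    import numpy as np

    def histogram_uniform_bins(a, nbins, lo, hi):
        a = np.asarray(a, dtype=np.float64).ravel()
        keep = (a >= lo) & (a <= hi)       # drop out-of-range values
        scale = nbins / float(hi - lo)
        idx = ((a[keep] - lo) * scale).astype(np.intp)
        idx[idx == nbins] = nbins - 1      # last bin's right edge is closed
        return np.bincount(idx, minlength=nbins)

    a = np.random.randn(1000000)
    counts = histogram_uniform_bins(a, 50, -4.0, 4.0)
    # should match np.histogram(a, bins=50, range=(-4, 4))[0]
    # up to floating point rounding at bin edges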
From dave.hirschfeld at gmail.com  Mon Mar 16 11:53:08 2015
From: dave.hirschfeld at gmail.com (Dave Hirschfeld)
Date: Mon, 16 Mar 2015 15:53:08 +0000 (UTC)
Subject: [Numpy-discussion] Fastest way to compute summary statistics for
 a specific axis
Message-ID:

I have a number of large arrays for which I want to compute the mean and
standard deviation over a particular axis - e.g. I want to compute the
statistics for axis=1 as if the other axes were combined so that in the
example below I get two values back

    In [1]: a = randn(30, 2, 10000)

For the mean this can be done easily like:

    In [2]: a.mean(0).mean(-1)
    Out[2]: array([ 0.0007, -0.0009])

...but this won't work for the std. Using some transformations we can come
up with something which will work for either:

    In [3]: a.transpose(2,0,1).reshape(-1, 2).mean(axis=0)
    Out[3]: array([ 0.0007, -0.0009])

    In [4]: a.transpose(1,0,2).reshape(2, -1).mean(axis=-1)
    Out[4]: array([ 0.0007, -0.0009])

If we look at the performance of these equivalent methods:

    In [5]: %timeit a.transpose(2,0,1).reshape(-1, 2).mean(axis=0)
    100 loops, best of 3: 14.5 ms per loop

    In [6]: %timeit a.transpose(1,0,2).reshape(2, -1).mean(axis=-1)
    100 loops, best of 3: 5.05 ms per loop

we can see that the latter version is a clear winner. Investigating
further, both methods appear to copy the data, so the performance is
likely down to better cache utilisation.

    In [7]: np.may_share_memory(a, a.transpose(2,0,1).reshape(-1, 2))
    Out[7]: False

    In [8]: np.may_share_memory(a, a.transpose(1,0,2).reshape(2, -1))
    Out[8]: False

Both methods are however significantly slower than the initial attempt:

    In [9]: %timeit a.mean(0).mean(-1)
    1000 loops, best of 3: 1.2 ms per loop

Perhaps because it allocates a smaller temporary?

For those who like a challenge: is there a faster way to achieve what I'm
after?

Cheers,
Dave

From oscar.j.benjamin at gmail.com  Mon Mar 16 12:04:30 2015
From: oscar.j.benjamin at gmail.com (Oscar Benjamin)
Date: Mon, 16 Mar 2015 16:04:30 +0000
Subject: [Numpy-discussion] Fastest way to compute summary statistics for
 a specific axis
In-Reply-To: References: Message-ID:

On 16 March 2015 at 15:53, Dave Hirschfeld wrote:
> I have a number of large arrays for which I want to compute the mean and
> standard deviation over a particular axis - e.g. I want to compute the
> statistics for axis=1 as if the other axes were combined so that in the
> example below I get two values back
>
> In [1]: a = randn(30, 2, 10000)
> ...
> Both methods are however significantly slower than the initial attempt:
>
> In [9]: %timeit a.mean(0).mean(-1)
> 1000 loops, best of 3: 1.2 ms per loop
>
> Perhaps because it allocates a smaller temporary?
>
> For those who like a challenge: is there a faster way to achieve what
> I'm after?

You'll probably find it faster if you swap the means around to make an
even smaller temporary: a.mean(-1).mean(0)

Oscar

From ewm at redtetrahedron.org  Mon Mar 16 12:10:21 2015
From: ewm at redtetrahedron.org (Eric Moore)
Date: Mon, 16 Mar 2015 12:10:21 -0400
Subject: [Numpy-discussion] Fastest way to compute summary statistics for
 a specific axis
In-Reply-To: References: Message-ID:

On Mon, Mar 16, 2015 at 11:53 AM, Dave Hirschfeld wrote:
> I have a number of large arrays for which I want to compute the mean and
> standard deviation over a particular axis - e.g.
> I want to compute the statistics for axis=1 as if the other axes were
> combined, so that in the example below I get two values back.
>
> In [1]: a = randn(30, 2, 10000)
> [...]

Specify all of the axes you want to reduce over as a tuple:

In [1]: import numpy as np

In [2]: a = np.random.randn(30, 2, 10000)

In [3]: a.mean(axis=(0,-1))
Out[3]: array([-0.00224589,  0.00230759])

In [4]: a.std(axis=(0,-1))
Out[4]: array([ 1.00062771,  1.0001258 ])

-Eric

From sebastian at sipsolutions.net  Mon Mar 16 12:12:56 2015
From: sebastian at sipsolutions.net (Sebastian Berg)
Date: Mon, 16 Mar 2015 17:12:56 +0100
Subject: [Numpy-discussion] Fastest way to compute summary statistics for a specific axis
Message-ID: <1426522376.17269.6.camel@sipsolutions.net>

On Mo, 2015-03-16 at 15:53 +0000, Dave Hirschfeld wrote:
> I have a number of large arrays for which I want to compute the mean
> and standard deviation over a particular axis [...]
>
> For the mean this can be done easily like:
>
> In [2]: a.mean(0).mean(-1)
> Out[2]: array([ 0.0007, -0.0009])

If you have numpy 1.7+ (which I guess by now is basically always the
case), you can do a.mean((0, 1)). It isn't actually faster in this
example, probably because it has to use buffered iterators and the
like, but I would guess its performance should be much more stable with
respect to memory order, etc. than any other method.

- Sebastian
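A quick check, not from the thread itself, that the axis-tuple reduction
agrees with the manual transpose/reshape approach for the standard
deviation as well:

import numpy as np

a = np.random.randn(30, 2, 10000)
manual = a.transpose(1, 0, 2).reshape(2, -1).std(axis=-1)
direct = a.std(axis=(0, 2))
print(np.allclose(manual, direct))   # True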
From Jerome.Kieffer at esrf.fr  Mon Mar 16 12:28:48 2015
From: Jerome.Kieffer at esrf.fr (Jérôme Kieffer)
Date: Mon, 16 Mar 2015 17:28:48 +0100
Subject: [Numpy-discussion] Rewrite np.histogram in c?
Message-ID: <20150316172848.421a2f6f@lintaillefer.esrf.fr>

On Mon, 16 Mar 2015 06:56:58 -0700
Jaime Fernández del Río wrote:

> Dispatching to a different method seems like a no-brainer indeed. The
> question is whether we really need to do this in C.

I need to do both unweighted & weighted histograms, and we got a factor
of 5 using (simple) Cython; it is in the proceedings of EuroSciPy from
last year: http://arxiv.org/pdf/1412.6367.pdf
We got much faster than that, but that's another story. In fact, many
people coming from IDL or Matlab are surprised by the poor performance
of numpy's histogram.

Cheers
--
Jérôme Kieffer
tel +33 476 882 445

From jaime.frio at gmail.com  Mon Mar 16 14:19:48 2015
From: jaime.frio at gmail.com (Jaime Fernández del Río)
Date: Mon, 16 Mar 2015 11:19:48 -0700
Subject: [Numpy-discussion] Rewrite np.histogram in c?

On Mon, Mar 16, 2015 at 9:28 AM, Jerome Kieffer wrote:

> On Mon, 16 Mar 2015 06:56:58 -0700
> Jaime Fernández del Río wrote:
>
> > Dispatching to a different method seems like a no-brainer indeed. The
> > question is whether we really need to do this in C.
>
> I need to do both unweighted & weighted histograms, and we got a factor
> of 5 using (simple) Cython; it is in the proceedings of EuroSciPy from
> last year: http://arxiv.org/pdf/1412.6367.pdf

If I read your paper and code properly, you got 5x faster mostly because
you combined the weighted and unweighted histograms into a single search
of the array, and because you used an algorithm that can only be applied
to equal-sized bins, similarly to the 10x speed-up Robert was reporting.

I think that having a special path for equal-sized bins is a great idea:
let's do it, PRs are always welcome! Similarly, getting the counts
together with the weights seems like a very good idea.

I also think that writing it in Python is going to take us 80% of the
way there: most of the improvements both of you have reported are not
likely to be coming from the language chosen, but from the algorithm
used. And if C proves to be sufficiently faster to warrant using it, it
should be confined to the number crunching: I don't think there is any
point in rewriting argument parsing in C.

Also, keep in mind `np.histogram` can now handle arrays of just about
**any** dtype. Handling that complexity in C is not a walk in the park.
Other functions like `np.bincount` and `np.digitize` cheat by only
handling `double` typed arrays, a luxury that histogram probably can't
afford at this point in time.

Jaime

--
(\__/)
( O.o)
( > <) This is Conejo. Copy Conejo into your signature and help him
with his plans for world domination.
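One way to get the counts and the weighted sums from a single pass over
the data, sketched here for the equal-width case only; this illustrates
the idea being discussed and is not the code from the paper or from
np.histogram:

import numpy as np

def weighted_hist_uniform(a, weights, nbins, lo, hi):
    # Compute each sample's bin index once, then reuse it for both the
    # unweighted counts and the weighted sums.
    a = np.asarray(a, dtype=float)
    w = np.asarray(weights, dtype=float)
    keep = (a >= lo) & (a <= hi)
    idx = ((a[keep] - lo) * nbins / (hi - lo)).astype(np.intp)
    idx[idx == nbins] = nbins - 1
    counts = np.bincount(idx, minlength=nbins)
    sums = np.bincount(idx, weights=w[keep], minlength=nbins)
    return counts, sums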
From shoyer at gmail.com  Mon Mar 16 14:27:29 2015
From: shoyer at gmail.com (Stephan Hoyer)
Date: Mon, 16 Mar 2015 11:27:29 -0700
Subject: [Numpy-discussion] numpy.stack -- which function, if any, deserves the name?

On Mon, Mar 16, 2015 at 1:50 AM, Stefan Otte wrote:

> Summarizing, my proposal is mostly concerned with how to create block
> arrays from given arrays. I don't care about the name "stack". I just
> used "stack" because it replaced hstack/vstack for me. Maybe "bstack"
> for block stack, or "barray" for block array?

Stefan -- thanks for sharing your perspective!

In conclusion, it sounds like we could safely use "stack" for my PR
(proposal 2), and use another name (perhaps "block", "barray" or
"block_array") for your proposal.

I'm also not opposed to using a new verb for my PR (the stacking
alternative to "concatenate"), but I haven't come up with any more
descriptive alternatives.

From rmcgibbo at gmail.com  Mon Mar 16 14:35:45 2015
From: rmcgibbo at gmail.com (Robert McGibbon)
Date: Mon, 16 Mar 2015 11:35:45 -0700
Subject: [Numpy-discussion] Rewrite np.histogram in c?

Hi,

It sounds like putting together a PR makes sense then. I'll try hacking
on this a bit.

-Robert

On Mar 16, 2015 11:20 AM, "Jaime Fernández del Río" wrote:

> On Mon, Mar 16, 2015 at 9:28 AM, Jerome Kieffer wrote:
> [...]
From dieter.van.eessen at gmail.com  Tue Mar 17 04:11:23 2015
From: dieter.van.eessen at gmail.com (Dieter Van Eessen)
Date: Tue, 17 Mar 2015 09:11:23 +0100
Subject: [Numpy-discussion] 3D array and the right hand rule

Hello,

Sorry to disturb again, but the topic still bugs me somehow...
I'll try to rephrase the question:

- What's the influence of the type of N-array representation with
  respect to TENSOR calculus?
- Are multiple representations possible?
- I assume that the order of the dimensions plays a major role in, for
  example, the TENSOR product. Is this assumption correct?

As I said before, my math skills are lacking in this area...
I hope you consider this a valid question.

kind regards,
Dieter

On Fri, Jan 30, 2015 at 2:32 AM, Alexander Belopolsky wrote:

> On Mon, Jan 26, 2015 at 6:06 AM, Dieter Van Eessen wrote:
>
>> I've read that numpy.array isn't arranged according to the
>> 'right-hand-rule' (right-hand-rule => thumb = +x; index finger = +y,
>> bent middle finger = +z). This is also confirmed by an old message I
>> dug up from the mailing list archives. (see message below)
>
> Dieter,
>
> It looks like you are confusing the dimensionality of the array with
> the dimensionality of a vector that it might store. If you are
> interested in using numpy for 3D modeling, you will likely only
> encounter 1-dimensional arrays (vectors) of size 3 and 2-dimensional
> arrays (matrices) of size 9 or shape (3, 3).
>
> A 3-dimensional array is a stack of matrices and the 'right-hand-rule'
> does not really apply. The notion of C/F-contiguous deals with the
> order of axes (e.g. width first or depth first) while the
> right-hand-rule is about the direction of the axes (if you "flip" the
> middle finger, the right hand becomes left.) In the case of arrays
> this would probably correspond to little-endian vs. big-endian: is
> a[0] stored at a higher or lower address than a[1]. However, whatever
> the answer to this question is for a particular system, it is the same
> for all axes in the array, so the right-hand - left-hand distinction
> does not apply.

--
gtz,
Dieter VE

From dave.hirschfeld at gmail.com  Tue Mar 17 07:41:37 2015
From: dave.hirschfeld at gmail.com (Dave Hirschfeld)
Date: Tue, 17 Mar 2015 11:41:37 +0000 (UTC)
Subject: [Numpy-discussion] Fastest way to compute summary statistics for a specific axis
References: <1426522376.17269.6.camel@sipsolutions.net>

Sebastian Berg <sebastian at sipsolutions.net> writes:

> On Mo, 2015-03-16 at 15:53 +0000, Dave Hirschfeld wrote:
> > I have a number of large arrays for which I want to compute the mean
> > and standard deviation over a particular axis [...]
>
> If you have numpy 1.7+ (which I guess by now is basically always the
> case), you can do a.mean((0, 1)).
> It isn't actually faster in this example, probably because it has to
> use buffered iterators and the like, but I would guess its performance
> should be much more stable with respect to memory order, etc. than any
> other method.
>
> - Sebastian

Wow, I didn't know you could even do that - that's very cool (and a lot
cleaner than manually reordering & reshaping).

It seems to be pretty fast for me and reasonably stable wrt memory order:

In [199]: %timeit a.mean(0).mean(-1)
     ...: %timeit a.mean(axis=(0,2))
     ...: %timeit a.transpose(1,0,2).reshape(2, -1).mean(axis=-1)
     ...: %timeit a.transpose(2,0,1).reshape(-1, 2).mean(axis=0)
     ...:
1000 loops, best of 3: 1.52 ms per loop
1000 loops, best of 3: 1.5 ms per loop
100 loops, best of 3: 4.8 ms per loop
100 loops, best of 3: 14.6 ms per loop

In [200]: a = a.copy('F')

In [201]: %timeit a.mean(0).mean(-1)
     ...: %timeit a.mean(axis=(0,2))
     ...: %timeit a.transpose(1,0,2).reshape(2, -1).mean(axis=-1)
     ...: %timeit a.transpose(2,0,1).reshape(-1, 2).mean(axis=0)
100 loops, best of 3: 2.02 ms per loop
100 loops, best of 3: 3.29 ms per loop
100 loops, best of 3: 7.18 ms per loop
100 loops, best of 3: 15.9 ms per loop

Thanks,
Dave

From allanhaldane at gmail.com  Tue Mar 17 12:45:36 2015
From: allanhaldane at gmail.com (Allan Haldane)
Date: Tue, 17 Mar 2015 12:45:36 -0400
Subject: [Numpy-discussion] should views into structured arrays be reversible?
Message-ID: <55085A30.1060604@gmail.com>

Hello all,

I've introduced PR 5548 which, through more careful safety checks,
allows views of object arrays. However, I had to make 'partial views'
into structured arrays irreversible, and I want to check with the list
that that's ok.

With the PR, if you only view certain fields of an array, you cannot
take a 'reverse' view of the resulting object to get back the original
array:

    >>> arr = np.array([(1,2),(4,5)], dtype=[('A', 'i'), ('B', 'i')])
    >>> varr = arr.view({'names': ['A'], 'formats': ['i'],
    ...                  'itemsize': arr.dtype.itemsize})
    >>> varr.view(arr.dtype)
    TypeError: view would access data parent array doesn't own

I.e., with this PR you can only take views into parts of an array that
have fields.

This was necessary in order to guarantee that we never interpret memory
containing a Python object as another type, which could cause a
segfault. I have a more extensive discussion & motivation in the PR,
including an alternative idea.

So does this limitation seem reasonable?

Cheers,
Allan

From mshubhankar at yahoo.co.in  Tue Mar 17 14:00:35 2015
From: mshubhankar at yahoo.co.in (Shubhankar Mohapatra)
Date: Tue, 17 Mar 2015 18:00:35 +0000 (UTC)
Subject: [Numpy-discussion] Mathematical functions in Numpy
Message-ID: <2000837013.506068.1426615235275.JavaMail.yahoo@mail.yahoo.com>

Hello all,
I am an undergraduate and I am trying to do a project this time on numpy
in GSoC. This project is about integrating the vector math libraries
SLEEF and Yeppp into numpy, to make the mathematical functions faster. I
have already studied the new library classes, but I am unable to find
the sin and cos function definitions in the numpy source code. Can
someone please help me find these functions in the source code, so that
I can implement the new library class in numpy?
Thanking you,
Shubhankar Mohapatra
From matthieu.brucher at gmail.com  Tue Mar 17 14:29:56 2015
From: matthieu.brucher at gmail.com (Matthieu Brucher)
Date: Tue, 17 Mar 2015 18:29:56 +0000
Subject: [Numpy-discussion] Mathematical functions in Numpy

Hi,

These functions are defined in the C standard library!

Cheers,

Matthieu

2015-03-17 18:00 GMT+00:00 Shubhankar Mohapatra:
> Hello all,
> I am an undergraduate and I am trying to do a project this time on
> numpy in GSoC. [...]

--
Information System Engineer, Ph.D.
Blog: http://matt.eifelle.com
LinkedIn: http://www.linkedin.com/in/matthieubrucher
Music band: http://liliejay.com/

From jbednar at inf.ed.ac.uk  Tue Mar 17 14:37:44 2015
From: jbednar at inf.ed.ac.uk (James A. Bednar)
Date: Tue, 17 Mar 2015 18:37:44 +0000
Subject: [Numpy-discussion] ANN: HoloViews 1.0 released
Message-ID: <21768.29816.377872.958413@hebb.inf.ed.ac.uk>

We are pleased to announce the first public release of HoloViews, a
Python package for scientific and engineering data visualization:

    http://ioam.github.io/holoviews

HoloViews provides composable, sliceable, declarative data structures
for building even complex visualizations easily. It's designed to
exploit the rich ecosystem of scientific Python tools already
available, using Numpy for data storage, matplotlib and mpld3 as
plotting backends, and integrating fully with IPython Notebook to make
your data instantly visible.

If you look at the website for just about any other visualization
package, you'll see a long list of pretty pictures, each one of which
has a page or two of code putting it together. There are pretty
pictures in HoloViews too, but there is *no* hidden code -- *all* of
the steps needed to build a given figure are shown right before the
HoloViews plot, with just a few lines needed for nearly all of our
examples, even complex multi-figure subplots and animations. This
concise but flexible specification makes it practical to explore and
analyze your data interactively, while leaving a full record for later
reproducibility in the notebook.

It may sound like magic, but it's not -- HoloViews simply lets you
annotate your data with appropriate metadata, and then the data can
display itself! HoloViews provides a set of general, compositional,
multidimensional data structures suitable for both discrete and
continuous real-world data, and pairs them with separate customizable
plotting classes to visualize them without extensive coding.
A large collection of continuously tested IPython Notebook tutorials
accompanies HoloViews, showing you precisely the small number of steps
required to generate any of the plots.

Some of the most important features:

- Freely available under a BSD license
- Python 2 and 3 compatible
- Minimal external dependencies -- easy to integrate into your workflow
- Builds figures by slicing, sampling, and composing your data
- Builds web-embeddable animations without any extra coding
- Easily customizable without obscuring the underlying data objects
- Includes interfaces to pandas and Seaborn
- Winner of the 2015 UK Open Source Award

For the rest, check out ioam.github.io/holoviews!

Jean-Luc Stevens
Philipp Rudiger
James A. Bednar

The University of Edinburgh
School of Informatics

--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.

From robert.kern at gmail.com  Tue Mar 17 14:51:49 2015
From: robert.kern at gmail.com (Robert Kern)
Date: Tue, 17 Mar 2015 18:51:49 +0000
Subject: [Numpy-discussion] Mathematical functions in Numpy

On Tue, Mar 17, 2015 at 6:29 PM, Matthieu Brucher wrote:
>
> Hi,
>
> These functions are defined in the C standard library!

I think he's asking how to define numpy ufuncs.

--
Robert Kern

From klemm at phys.ethz.ch  Tue Mar 17 15:09:37 2015
From: klemm at phys.ethz.ch (Hanno Klemm)
Date: Tue, 17 Mar 2015 20:09:37 +0100
Subject: [Numpy-discussion] 3D array and the right hand rule

> On 17 Mar 2015, at 09:11, Dieter Van Eessen wrote:
>
> Hello,
>
> Sorry to disturb again, but the topic still bugs me somehow...
> I'll try to rephrase the question:
>
> - What's the influence of the type of N-array representation with
>   respect to TENSOR calculus?
> [...]
Hi,

let us say you have an n-dimensional tensor of rank k. Then you can
represent that beast as a numpy array with k axes, each of length n. So,
if n is 4 and k is 2, you have a 4-dimensional tensor of rank 2; this
tensor would be an array of shape (4, 4). Higher-rank tensors give
higher-dimensional arrays: a tensor of rank 3 would be an array of shape
(4, 4, 4) in this example.

Of course you have to be careful over which axes you do tensor
operations. Something like V_ik = C_ijk v_j would then mean that you
want to sum over the middle axis of the 3-dimensional array C (of shape
(4, 4, 4)) and the single axis of the 1-dimensional array v (of shape
(4,)). By the way, if you think about doing those things, np.einsum
might help.

Of course, knowing which axis in the numpy array corresponds to which
index of your tensor is quite important.

Hanno
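For the contraction written above, V_ik = C_ijk v_j, np.einsum lets you
spell out the summed index explicitly; a small sketch with made-up data:

import numpy as np

n = 4
C = np.random.randn(n, n, n)   # rank-3 tensor, shape (4, 4, 4)
v = np.random.randn(n)         # rank-1 tensor, shape (4,)

# Sum over the shared index j, the middle axis of C:
V = np.einsum('ijk,j->ik', C, v)
print(V.shape)   # (4, 4)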
From jtaylor.debian at googlemail.com  Tue Mar 17 16:01:18 2015
From: jtaylor.debian at googlemail.com (Julian Taylor)
Date: Tue, 17 Mar 2015 21:01:18 +0100
Subject: [Numpy-discussion] Mathematical functions in Numpy
Message-ID: <5508880E.6050406@googlemail.com>

Currently the math functions are wrapped via the generic PyUfunc_*
functions in numpy/core/src/umath/loops.c.src, which just apply an
arbitrary scalar function to arbitrarily strided inputs. When adding
variants, one likely needs to add some special-purpose loops to deal
with the various special requirements of the vector math APIs. This
involves adding some special cases to the ufunc generation in
numpy/core/code_generators/generate_umath.py and then implementing the
new kernel functions.

See e.g. this oldish PR, which changes the sqrt function from a
PyUfunc_d_d function to a special loop to take advantage of vectorized
machine instructions:

    https://github.com/numpy/numpy/pull/3341

Some things have changed a bit since then, but it does show many of the
files you probably need to look at for this project.

On 17.03.2015 19:51, Robert Kern wrote:
> On Tue, Mar 17, 2015 at 6:29 PM, Matthieu Brucher wrote:
>>
>> These functions are defined in the C standard library!
>
> I think he's asking how to define numpy ufuncs.
> [...]
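Conceptually, such an inner loop walks arbitrarily strided input and
output buffers and applies a scalar kernel element by element. A Python
sketch of the calling pattern only: the real loops in loops.c.src are C
and operate on byte strides, and the function name below is made up:

import math

def unary_inner_loop(in_buf, out_buf, n, in_step, out_step, func=math.sqrt):
    # Mirrors the PyUfunc_d_d pattern: one scalar call per element,
    # honouring an arbitrary step on both input and output.
    for i in range(n):
        out_buf[i * out_step] = func(in_buf[i * in_step])

out = [0.0, 0.0, 0.0]
unary_inner_loop([4.0, 9.0, 16.0], out, 3, 1, 1)
print(out)   # [2.0, 3.0, 4.0]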
From cgodshall at enthought.com  Wed Mar 18 14:30:14 2015
From: cgodshall at enthought.com (Courtenay Godshall (Enthought))
Date: Wed, 18 Mar 2015 13:30:14 -0500
Subject: [Numpy-discussion] SciPy 2015 Call for Proposals Open - tutorial & talk submissions due April 1st
Message-ID: <012c01d061a9$95928840$c0b798c0$@enthought.com>

**SciPy 2015 Conference (Scientific Computing with Python) Call for
Proposals: Submit Your Tutorial and Talk Ideas by April 1, 2015 at
http://scipy2015.scipy.org.**

SciPy 2015, the fourteenth annual Scientific Computing with Python
conference, will be held July 6-12, 2015 in Austin, Texas. SciPy is a
community dedicated to the advancement of scientific computing through
open source Python software for mathematics, science, and engineering.
The annual SciPy Conference brings together over 500 participants from
industry, academia, and government to showcase their latest projects,
learn from skilled users and developers, and collaborate on code
development. The full program will consist of two days of tutorials,
followed by three days of presentations, and concludes with two days of
developer sprints. More info is available on the conference website at
http://scipy2015.scipy.org; you can also sign up on the website for
mailing list updates or follow @scipyconf on Twitter. We hope you'll
join us - early bird registration is open until May 15, 2015 at
http://scipy2015.scipy.org

We encourage you to submit tutorial or talk proposals in the categories
below; please also share with others who you'd like to see participate!
Submit via the conference website @ http://scipy2015.scipy.org.

*SCIPY TUTORIAL SESSION PROPOSALS - DEADLINE EXTENDED TO WED APRIL 1,
2015*

The SciPy experience kicks off with two days of tutorials. These
sessions provide extremely affordable access to expert training, and
consistently receive fantastic feedback from participants. We're
looking for submissions on topics from introductory to advanced - we'll
have attendees across the gamut looking to learn. Whether you are a
major contributor to a scientific Python library or an expert-level
user, this is a great opportunity to share your knowledge, and stipends
are available.

Submit Your Tutorial Proposal on the SciPy 2015 website:
http://scipy2015.scipy.org

*SCIPY TALK AND POSTER SUBMISSIONS - DUE APRIL 1, 2015*

SciPy 2015 will include 3 major topic tracks and 7 mini-symposia
tracks. Submit Your Talk Proposal on the SciPy 2015 website:
http://scipy2015.scipy.org

Major topic tracks include:
- Scientific Computing in Python (General track)
- Python in Data Science
- Quantitative and Computational Social Sciences

Mini-symposia will include the applications of Python in:
- Astronomy and astrophysics
- Computational life and medical sciences
- Engineering
- Geographic information systems (GIS)
- Geophysics
- Oceanography and meteorology
- Visualization, vision and imaging

If you have any questions or comments, feel free to contact us at:
scipy-organizers at scipy.org

From marcospc6 at gmail.com  Wed Mar 18 21:02:58 2015
From: marcospc6 at gmail.com (Marcos .)
Date: Wed, 18 Mar 2015 22:02:58 -0300
Subject: [Numpy-discussion] Porting C to Python

Dear colleagues,

My name is Marcos Chaves, I'm an undergraduate student from Brazil. I
study at the State University of Campinas, and am currently at the end
of my degree.

I'm deeply interested in applying to GSoC this year, specifically to
work with NumPy. A few months ago my friends at work and I started
porting some legacy C/C++ code to Python, code that we used to evaluate
images using OpenCV. We made use of NumPy and I liked it. I believe I
can be of some help in making it better, particularly on the task of
porting parts of it from C (while maintaining optimal performance).

I've never contributed to NumPy in any way, and I see that it's a
requirement for me to apply for such a task. I would like to know if
there's a way I can participate, because the end of student
registration is coming soon and I'm not sure if there is enough time
for me to do something like that, but I'm interested.

Some input from the potential mentor, Nathaniel Smith, along with some
extra info on this task would be very nice.

Thank you!

From saprativejana at gmail.com  Thu Mar 19 02:25:19 2015
From: saprativejana at gmail.com (Saprative Jana)
Date: Thu, 19 Mar 2015 11:55:19 +0530
Subject: [Numpy-discussion] Improve Numpy Datetime Functionality for Gsoc

Hi,

I am Saprative. I am new to numpy development. I want to work on the
project of improving numpy's datetime functionality. I want to solve
some related bugs and get started with the basics. As there is no IRC
channel for numpy, I am having a problem contacting the mentors;
moreover, there are no mentors mentioned for this project. Anybody who
can help me out, please contact me.

From,
Saprative Jana
(Mob: +919477325233)
From c99.smruti at gmail.com  Thu Mar 19 06:15:06 2015
From: c99.smruti at gmail.com (SMRUTI RANJAN SAHOO)
Date: Thu, 19 Mar 2015 15:45:06 +0530
Subject: [Numpy-discussion] Improve Numpy Datetime Functionality for Gsoc

I am also a student developer. If I find out anything, I will tell you.

On Thu, Mar 19, 2015 at 11:55 AM, Saprative Jana wrote:
> Hi,
> I am Saprative. I am new to numpy development. [...]

From cgohlke at uci.edu  Thu Mar 19 17:06:50 2015
From: cgohlke at uci.edu (Christoph Gohlke)
Date: Thu, 19 Mar 2015 14:06:50 -0700
Subject: [Numpy-discussion] Problem with _dotblas.pyd when using Matplotlib for 3d plot
Message-ID: <550B3A6A.1080200@uci.edu>

On 11/19/2014 3:21 PM, Charles R Harris wrote:
>
> On Wed, Nov 19, 2014 at 3:03 PM, Mégardon Geoffrey wrote:
>
>     Hi,
>
>     In verbose mode, it stops at this line:
>     test_blasdot.test_dot_3args ...
>
>     Ok, I will try with numpy and python :/
>
> Might want to file a bug report with Continuum Analytics. Plain old
> numpy will not be using _dotblas, that requires ATLAS or MKL for blas
> support.
>
> Chuck

The crash is likely due to a bug in Intel's MKL when running on AMD
processors. It has been fixed in MKL 11.1 Update 4 and 11.2.

Christoph

From jianhong.wang at gmail.com  Thu Mar 19 22:31:30 2015
From: jianhong.wang at gmail.com (Jianhong Wang)
Date: Thu, 19 Mar 2015 22:31:30 -0400
Subject: [Numpy-discussion] how to optimize numpy code for Markovian path

Below is a Python function to generate a Markov path (the travelling
salesman problem):

import numpy as np

def generate_travel_path(markov_matrix, n):
    assert markov_matrix.shape[0] == markov_matrix.shape[1]
    assert n <= markov_matrix.shape[0]

    p = markov_matrix.copy()
    path = [0] * n
    for k in range(1, n):
        k1 = path[k-1]
        row_sums = 1 / (1 - p[:, k1])
        p *= row_sums[:, np.newaxis]
        p[:, k1] = 0
        path[k] = np.random.multinomial(1, p[k1, :]).argmax()

    assert len(set(path)) == n
    return path

markov_matrix is a predefined Markov transition matrix. The code
generates a path that starts from node zero and visits every node once,
based on this matrix. However, I feel the function is quite slow.
Below is the line-by-line profile with a 53x53 markov_matrix:

Timer unit: 3.49943e-07 s
Total time: 0.00551195 s
Function: generate_travel_path at line 1

Line #  Hits   Time  Per Hit  % Time  Line Contents
==============================================================
     1                                 def generate_travel_path(markov_matrix, n):
     2      1     31     31.0     0.2      assert markov_matrix.shape[0] == markov_matrix.shape[1]
     3      1     12     12.0     0.1      assert n <= markov_matrix.shape[0]
     4
     5      1     99     99.0     0.6      p = markov_matrix.copy()
     6      1     12     12.0     0.1      path = [0] * n
     7     53    416      7.8     2.6      for k in range(1, n):
     8     52    299      5.8     1.9          k1 = path[k-1]
     9     52   3677     70.7    23.3          row_sums = 1 / (1 - p[:, k1])
    10     52   4811     92.5    30.5          p = p * row_sums[:, np.newaxis]
    11     52   1449     27.9     9.2          p[:, k1] = 0
    12     52   4890     94.0    31.0          path[k] = np.random.multinomial(1, p[k1, :]).argmax()
    13
    14      1     51     51.0     0.3      assert len(set(path)) == n
    15      1      4      4.0     0.0      return path

If I run this function 25000 times, it takes more than 125 seconds. Is
there any headroom to improve the speed?

Below is a simple function to generate a Markov matrix:

def initial_trans_matrix(n):
    x = np.ones((n, n)) / (n - 1)
    np.fill_diagonal(x, 0.0)
    return x
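One possible speed-up, sketched under the assumption that only
statistical equivalence matters: because only the current row is ever
sampled from, the full-matrix renormalisation inside the loop can be
skipped entirely. Masking the visited nodes out of the current row and
sampling proportionally to the remaining mass should draw from the same
distribution, at O(n) instead of O(n^2) work per step. This is an
illustrative rewrite, not a tested drop-in replacement:

import numpy as np

def generate_travel_path_fast(markov_matrix, n):
    visited = np.zeros(n, dtype=bool)
    visited[0] = True
    path = [0]
    for _ in range(1, n):
        row = markov_matrix[path[-1]].copy()
        row[visited] = 0.0                 # exclude already-visited nodes
        cdf = np.cumsum(row)
        u = np.random.rand() * cdf[-1]     # uniform draw over remaining mass
        nxt = int(np.searchsorted(cdf, u, side='right'))
        path.append(nxt)
        visited[nxt] = True
    return path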
From per.tunedal at operamail.com  Fri Mar 20 04:45:41 2015
From: per.tunedal at operamail.com (Per Tunedal)
Date: Fri, 20 Mar 2015 09:45:41 +0100
Subject: [Numpy-discussion] Installation on Windows
Message-ID: <1426841141.3132942.242906494.2401C9BA@webmail.messagingengine.com>

Hi,
how do I install Numpy on Windows? I've tried the setup.py file, but I
get an error message:

    setup.py install

gives:

    No module named msvccompiler in numpy.distutils; trying from distutils
    error: Unable to find vcvarsall.bat

Yours,
Per Tunedal

From Jerome.Kieffer at esrf.fr  Fri Mar 20 04:49:44 2015
From: Jerome.Kieffer at esrf.fr (Jérôme Kieffer)
Date: Fri, 20 Mar 2015 09:49:44 +0100
Subject: [Numpy-discussion] Installation on Windows
Message-ID: <20150320094944.39a1297a@lintaillefer.esrf.fr>

On Fri, 20 Mar 2015 09:45:41 +0100
Per Tunedal wrote:

> Hi,
> how do I install Numpy on Windows? I've tried the setup.py file, but I
> get an error message:
> [...]

Get the MS compiler from http://aka.ms/vcpython27

--
Jérôme Kieffer
tel +33 476 882 445

From njs at pobox.com  Fri Mar 20 05:06:43 2015
From: njs at pobox.com (Nathaniel Smith)
Date: Fri, 20 Mar 2015 02:06:43 -0700
Subject: [Numpy-discussion] Installation on Windows

On Mar 20, 2015 1:49 AM, "Jerome Kieffer" wrote:
>
> Get the MS compiler from
> http://aka.ms/vcpython27

Or more generally, you need to install the same version of MSVC or the
"platform SDK" that was used to compile your Python interpreter; this
varies between Python interpreters. The above link is appropriate for
Python 2.7 (and maybe some nearby versions too, I forget), but won't
help if you're using e.g. Python 3.4.

If you want maximum speed you'll also need to acquire an optimized BLAS
library from somewhere, which can be tricky. You might prefer to just
download Christoph Gohlke's wheels (which are linked against the
proprietary MKL, in case you care), or to use a scientific Python
distribution like Anaconda or Python(x,y).

-n

From sebix at sebix.at  Fri Mar 20 10:59:58 2015
From: sebix at sebix.at (Sebastian)
Date: Fri, 20 Mar 2015 15:59:58 +0100
Subject: [Numpy-discussion] Installation on Windows
Message-ID: <550C35EE.7090801@sebix.at>

Hi,

as you ask how to install Numpy and not how to compile it, I guess you
are looking for a so-called distribution. A distribution bundles
pre-compiled packages of Numpy and others together for simple usage of
Numpy; that's easy to accomplish. Otherwise you have to compile it
yourself with various dependencies. Have a look at:

    https://winpython.github.io/
    https://code.google.com/p/pythonxy/
    http://docs.continuum.io/anaconda/

regards,
Sebastian

On 03/20/2015 09:45 AM, Per Tunedal wrote:
> Hi,
> how do I install Numpy on Windows? I've tried the setup.py file, but I
> get an error message:
> [...]

--
python programming - mail server - photo - video - https://sebix.at
To verify my cryptographic signature or send me encrypted mails, get my
key at https://sebix.at/DC9B463B.asc and on public keyservers.

From alimuldal at gmail.com  Sat Mar 21 10:32:51 2015
From: alimuldal at gmail.com (Alistair Muldal)
Date: Sat, 21 Mar 2015 14:32:51 +0000
Subject: [Numpy-discussion] Use of NameValidator in np.genfromtxt is inconsistent with the rules for naming structured array fields
Message-ID: <550D8113.4060902@gmail.com>

Hi all,

I originally posted this to the issue tracker
(https://github.com/numpy/numpy/issues/5686), and am posting here as
well at the request of charris.
Currently, np.genfromtxt uses a numpy.lib._iotools.NameValidator which
mangles field names by replacing spaces and stripping out certain
non-alphanumeric characters, etc.:

    import numpy as np
    from io import BytesIO

    s = 'name,name with spaces,2*(x-1)!\n1,2,3\n4,5,6'
    x = np.genfromtxt(BytesIO(s), delimiter=',', names=True)
    print(repr(x))
    # array([(1.0, 2.0, 3.0), (4.0, 5.0, 6.0)],
    #       dtype=[('name', '<f8'), ('name_with_spaces', '<f8'),
    #              ('2x1', '<f8')])

From celi at alum.mit.edu  Mon Mar 23 01:09:00 2015
From: celi at alum.mit.edu (Lulu Li)
Date: Mon, 23 Mar 2015 01:09:00 -0400
Subject: [Numpy-discussion] GSoC projects

My apologies if I am posting to the wrong mailing list. I am interested
in the NumPy project ideas for Google Summer of Code 2015, as posted
here: https://github.com/scipy/scipy/wiki/GSoC-project-ideas. In
particular, knowing C and Python, I am interested in porting parts of
numpy from C to Cython or Pythonic types. I wonder if these projects are
still looking for participants? If so, I will be excited to put together
a proposal and work on them this summer.

Lulu

From lepto.python at gmail.com  Mon Mar 23 02:46:27 2015
From: lepto.python at gmail.com (oyster)
Date: Mon, 23 Mar 2015 14:46:27 +0800
Subject: [Numpy-discussion] element-wise array segmental function operation?

Hi, all

I want to know whether there is a terse way to apply a function to
every array element, where the function behaves according to the
element value. For example:

[code]
def fun(v):
    if 0<=v<60:
        return f1(v)   # where f1 is a function
    elif 60<=v<70:
        return f2(v)
    elif 70<=v<80:
        return f3(v)
    ...and so on...
[/code]

For 'a=numpy.array([20,50,75])', I hope to get numpy.array([f1(20),
f1(50), f3(75)]).

Thanks in advance

Lee

From nickpapior at gmail.com  Mon Mar 23 02:49:43 2015
From: nickpapior at gmail.com (Nick Papior Andersen)
Date: Mon, 23 Mar 2015 07:49:43 +0100
Subject: [Numpy-discussion] element-wise array segmental function operation?

You can do:

    vfun = np.vectorize(fun)
    vfun([20,50,75])

That should work; note the many options np.vectorize provides for
controlling the vectorized function. Otherwise you could use nested
np.where calls.

2015-03-23 7:46 GMT+01:00 oyster:
> Hi, all
> I want to know whether there is a terse way to apply a function to
> every array element, where the function behaves according to the
> element value.
> [...]

--
Kind regards Nick

From ralf.gommers at gmail.com  Mon Mar 23 03:12:24 2015
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Mon, 23 Mar 2015 08:12:24 +0100
Subject: [Numpy-discussion] GSoC projects

Hi Lulu, welcome!

On Mon, Mar 23, 2015 at 6:09 AM, Lulu Li wrote:

> My apologies if I am posting to the wrong mailing list. I am interested
> in the NumPy project ideas for Google Summer of Code 2015, as posted
> here: https://github.com/scipy/scipy/wiki/GSoC-project-ideas.
> In particular, knowing C and Python, I am interested in porting parts
> of numpy from C to Cython or Pythonic types. I wonder if these projects
> are still looking for participants? [...]

Proposals are still very welcome. There has been some interest in this
particular project idea, but I haven't seen any submitted proposals yet.
And even if there were, you could still submit yours. The deadline is
closing in fast, so you'll have to be quick though. Try to post a first
draft asap, so you can get some feedback and improve your proposal
before the 27th.

Also keep in mind that one of the requirements for getting your proposal
accepted is that you have submitted at least one patch to Numpy. This
allows us to interact with you and gives you an idea of how the Numpy
development process works.

Cheers,
Ralf

From per.tunedal at operamail.com  Mon Mar 23 03:38:54 2015
From: per.tunedal at operamail.com (Per Tunedal)
Date: Mon, 23 Mar 2015 08:38:54 +0100
Subject: [Numpy-discussion] Installation on Windows
In-Reply-To: <550C35EE.7090801@sebix.at>
Message-ID: <1427096334.313112.243888582.00B54142@webmail.messagingengine.com>

Hi,
thank you all! This turned out to be more complicated than I expected.
I tried installing the indicated compiler VCForPython27.msi, but that
didn't change anything.

On the other hand, I don't want to install any special distribution of
Python - I want to stick to the standard distribution, to be sure my
own code can run anywhere. I only need numpy to test the language
guesser langid.py, as I need a guesser for sentences (a very small
amount of text, which makes the language identification tricky).
langid.py happens to depend on numpy. I will try some other language
guesser instead.

I might install Anaconda on a virtual machine to compare numpy with
other solutions.

Yours,
Per Tunedal

On Fri, Mar 20, 2015, at 15:59, Sebastian wrote:
> Hi,
>
> as you ask how to install Numpy and not how to compile it, I guess you
> are looking for a so-called distribution. [...]
From jtaylor.debian at googlemail.com  Mon Mar 23 04:24:52 2015
From: jtaylor.debian at googlemail.com (Julian Taylor)
Date: Mon, 23 Mar 2015 09:24:52 +0100
Subject: [Numpy-discussion] element-wise array segmental function operation?
Message-ID: <550FCDD4.9090301@googlemail.com>

On 23.03.2015 07:46, oyster wrote:
> Hi, all
> I want to know whether there is a terse way to apply a function to
> every array element, where the function behaves according to the
> element value.
> [...]

piecewise should be what you are looking for:

http://docs.scipy.org/doc/numpy/reference/generated/numpy.piecewise.html
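Applied to the example above, np.piecewise takes a list of boolean
condition arrays and a matching list of callables; with placeholder
functions standing in for f1/f2/f3:

import numpy as np

f1 = lambda v: v / 2.0     # placeholder segment functions
f2 = lambda v: v + 100.0
f3 = lambda v: -v

a = np.array([20.0, 50.0, 75.0])
result = np.piecewise(
    a,
    [(0 <= a) & (a < 60), (60 <= a) & (a < 70), (70 <= a) & (a < 80)],
    [f1, f2, f3])
print(result)   # [ 10.  25. -75.]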
From jeffreback at gmail.com  Mon Mar 23 06:11:01 2015
From: jeffreback at gmail.com (Jeff Reback)
Date: Mon, 23 Mar 2015 06:11:01 -0400
Subject: [Numpy-discussion] ANN: pandas 0.16.0 released

Hello,

We are proud to announce v0.16.0 of pandas, a major release from 0.15.2.

This release includes a small number of API changes, several new
features, enhancements, and performance improvements along with a large
number of bug fixes. This was 4 months of work by 60 authors
encompassing 204 issues. We recommend that all users upgrade to this
version.

*Highlights:*

- *DataFrame.assign* method, see here
- *Series.to_coo/from_coo* methods to interact with *scipy.sparse*, see
  here
- Backwards incompatible change to *Timedelta* to conform the *.seconds*
  attribute with *datetime.timedelta*, see here
- Changes to the *.loc* slicing API to conform with the behavior of
  *.ix*, see here
- Changes to the default for ordering in the *Categorical* constructor,
  see here
- Enhancement to the *.str* accessor to make string operations easier,
  see here
- The *pandas.tools.rplot*, *pandas.sandbox.qtpandas* and *pandas.rpy*
  modules are deprecated. We refer users to external packages like
  seaborn, pandas-qt and rpy2 for similar or equivalent functionality,
  see here for more detail

See a full description of the Whatsnew for v0.16.0.

*What is it:*

*pandas* is a Python package providing fast, flexible, and expressive
data structures designed to make working with "relational" or "labeled"
data both easy and intuitive. It aims to be the fundamental high-level
building block for doing practical, real world data analysis in Python.
Additionally, it has the broader goal of becoming the most powerful and
flexible open source data analysis / manipulation tool available in any
language.

Documentation: http://pandas.pydata.org/pandas-docs/stable/

Source tarballs, Windows wheels, and Mac OS X wheels are available on
PyPI: https://pypi.python.org/pypi/pandas

The Windows binaries are courtesy of Christoph Gohlke and are built on
Numpy 1.9; the Mac OS X wheels are courtesy of Matthew Brett and are
built on Numpy 1.7.1.

Please report any issues here: https://github.com/pydata/pandas/issues

Thanks

The Pandas Development Team
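As a taste of the first highlight, DataFrame.assign returns a new
DataFrame with extra columns computed from the existing ones; a minimal
example with made-up data:

import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6]})
# Callables passed to assign receive the DataFrame, so new columns can
# be derived from existing ones without intermediate variables.
df2 = df.assign(c=lambda d: d['a'] + d['b'])
print(df2)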
>>
>> I also think that writing it in Python is going to take us 80% of the way there: most of the improvements both of you have reported are not likely to be coming from the language chosen, but from the algorithm used. And if C proves to be sufficiently faster to warrant using it, it should be confined to the number crunching: I don't think there is any point in rewriting argument parsing in C.
>>
>> Also, keep in mind `np.histogram` can now handle arrays of just about **any** dtype. Handling that complexity in C is not a ride in the park. Other functions like `np.bincount` and `np.digitize` cheat by only handling `double` typed arrays, a luxury that histogram probably can't afford at this point in time.
>>
>> Jaime
>>
>> --
>> (\__/)
>> ( O.o)
>> ( > <) Este es Conejo. Copia a Conejo en tu firma y ayúdale en sus planes de dominación mundial.

From opossumnano at gmail.com  Mon Mar 23 11:44:38 2015
From: opossumnano at gmail.com (Tiziano Zito)
Date: Mon, 23 Mar 2015 16:44:38 +0100
Subject: [Numpy-discussion] Reminder - Summer School "Advanced Scientific Programming in Python" in Munich, Germany
Message-ID: <20150323154437.GA17798@eniac>

Reminder: Deadline for application is 23:59 UTC, March 31, 2015.

Advanced Scientific Programming in Python
=========================================
a Summer School by the G-Node, the Bernstein Center for Computational Neuroscience Munich and the Graduate School of Systemic Neurosciences

Scientists spend more and more time writing, maintaining, and debugging software. While techniques for doing this efficiently have evolved, only a few scientists have been trained to use them. As a result, instead of doing their research, they spend far too much time writing deficient code and reinventing the wheel. In this course we will present a selection of advanced programming techniques, incorporating theoretical lectures and practical exercises tailored to the needs of a programming scientist. New skills will be tested in a real programming project: we will team up to develop an entertaining scientific computer game.

We use the Python programming language for the entire course. Python works as a simple programming language for beginners, but more importantly, it also works great in scientific simulations and data analysis. We show how clean language design, ease of extensibility, and the great wealth of open source libraries for scientific computing and data visualization are driving Python to become a standard tool for the programming scientist.

This school is targeted at Master or PhD students and Post-docs from all areas of science. Competence in Python or in another language such as Java, C/C++, MATLAB, or Mathematica is absolutely required. Basic knowledge of Python is assumed. Participants without any prior experience with Python should work through the proposed introductory materials before the course.

Date and Location
=================
August 31 - September 5, 2015. Munich, Germany.

Preliminary Program
===================
Day 0 (Mon Aug 31) - Best Programming Practices
* Best Practices for Scientific Computing
* Version control with git and how to contribute to Open Source with github
* Object-oriented programming & design patterns

Day 1 (Tue Sept 1) - Software Carpentry
* Test-driven development, unit testing & quality assurance
* Debugging, profiling and benchmarking techniques
* Advanced Python: generators, decorators, and context managers

Day 2 (Wed Sept 2) - Scientific Tools for Python
* Advanced NumPy
* The Quest for Speed (intro): Interfacing to C with Cython
* Contributing to Open Source Software/Programming in teams

Day 3 (Thu Sept 3) - The Quest for Speed
* Writing parallel applications in Python
* Python 3: why should I care
* Programming project

Day 4 (Fri Sept 4) - Efficient Memory Management
* When parallelization does not help: the starving CPUs problem
* Programming project

Day 5 (Sat Sept 5) - Practical Software Development
* Programming project
* The Pelita Tournament

Every evening we will have the tutors' consultation hour: tutors will answer your questions and give suggestions for your own projects.

Applications
============
You can apply on-line at https://python.g-node.org

Applications must be submitted before 23:59 UTC, March 31, 2015. Notifications of acceptance will be sent by May 1, 2015.

No fee is charged but participants should take care of travel, living, and accommodation expenses. Candidates will be selected on the basis of their profile. Places are limited: acceptance rate is usually around 20%. Prerequisites: you are supposed to know the basics of Python to participate in the lectures.

Preliminary Faculty
===================
* Pietro Berkes, Enthought Inc., UK
* Marianne Corvellec, Plotly Technologies Inc., Montréal, Canada
* Kathryn D. Huff, Department of Nuclear Engineering, University of California - Berkeley, USA
* Zbigniew Jędrzejewski-Szmek, Krasnow Institute, George Mason University, USA
* Eilif Muller, Blue Brain Project, École Polytechnique Fédérale de Lausanne, Switzerland
* Juan Nunez-Iglesias, Victorian Life Sciences Computation Initiative, University of Melbourne, Australia
* Rike-Benjamin Schuppner, Institute for Theoretical Biology, Humboldt-Universität zu Berlin, Germany
* Bartosz Teleńczuk, European Institute for Theoretical Neuroscience, CNRS, Paris, France
* Nelle Varoquaux, Centre for Computational Biology Mines ParisTech, Institut Curie, U900 INSERM, Paris, France
* Tiziano Zito, Forschungszentrum Jülich GmbH, Germany

Organized by Tiziano Zito (head) and Zbigniew Jędrzejewski-Szmek for the German Neuroinformatics Node of the INCF Germany, Christopher Roppelt for the German Center for Vertigo and Balance Disorders (DSGZ) and the Graduate School of Systemic Neurosciences (GSN) of the Ludwig-Maximilians-Universität Munich Germany, Christoph Hartmann for the Frankfurt Institute for Advanced Studies (FIAS) and International Max Planck Research School (IMPRS) for Neural Circuits, Frankfurt Germany, and Jakob Jordan for the Institute of Neuroscience and Medicine (INM-6) and Institute for Advanced Simulation (IAS-6), Jülich Research Centre and JARA.

Additional funding provided by the Bernstein Center for Computational Neuroscience (BCCN) Munich.

Website: https://python.g-node.org
Contact: python-info at g-node.org

From ralf.gommers at gmail.com  Mon Mar 23 13:36:14 2015
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Mon, 23 Mar 2015 18:36:14 +0100
Subject: [Numpy-discussion] Rewrite np.histogram in c?
References: <20150316172848.421a2f6f@lintaillefer.esrf.fr>

On Mon, Mar 23, 2015 at 2:59 PM, Daniel da Silva wrote:
> Hope this isn't too off-topic: but it would be very nice if np.histogram and np.histogram2d supported masked arrays. Is this out of scope outside the numpy.ma package?

Right now it looks like there's no histogram function at all for masked arrays - would be good to improve that situation.

If it's as easy as adding to np.histogram something like:

    if isinstance(a, np.ma.MaskedArray):
        a = a.data[~a.mask]

then it makes sense to add that I think.

Ralf
From ralf.gommers at gmail.com  Mon Mar 23 13:56:21 2015
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Mon, 23 Mar 2015 18:56:21 +0100
Subject: [Numpy-discussion] Asking proposal review/feedback for GSOC 15

On Mon, Mar 23, 2015 at 12:23 PM, Oğuzhan Ünlü wrote:
> My name is Oğuzhan (you may use 'Oguzhan'). I submitted a proposal on the system with the title 'NumPy - Vector math library integration'.
> I would appreciate any feedback from the community.

Hi Oğuzhan,

There are only a handful of potential mentors signed up in Melange, and this list is read by hundreds of people. So it would be good to post your proposal in a publicly accessible place and post the link here. Good options are on Github or on StackEdit.

Cheers,
Ralf

P.S. for those who do have access to Melange: http://www.google-melange.com/gsoc/proposal/review/org/google/gsoc2015/blacksimit/5741031244955648

From efiring at hawaii.edu  Mon Mar 23 14:38:29 2015
From: efiring at hawaii.edu (Eric Firing)
Date: Mon, 23 Mar 2015 08:38:29 -1000
Subject: [Numpy-discussion] Rewrite np.histogram in c?
Message-ID: <55105DA5.6060106@hawaii.edu>

On 2015/03/23 7:36 AM, Ralf Gommers wrote:
> Right now it looks like there's no histogram function at all for masked arrays - would be good to improve that situation.
>
> If it's as easy as adding to np.histogram something like:
>
>     if isinstance(a, np.ma.MaskedArray):
>         a = a.data[~a.mask]

It looks like it requires a little more than that, but not much. For full support a new mask would need to be made from the logical_or of the "a" mask and the weights mask, and then used to compress both "a" and weights.

Eric
From njs at pobox.com  Mon Mar 23 14:57:01 2015
From: njs at pobox.com (Nathaniel Smith)
Date: Mon, 23 Mar 2015 11:57:01 -0700
Subject: [Numpy-discussion] Rewrite np.histogram in c?

On Mar 23, 2015 6:59 AM, "Daniel da Silva" wrote:
> Hope this isn't too off-topic: but it would be very nice if np.histogram and np.histogram2d supported masked arrays. Is this out of scope outside the numpy.ma package?

Usually the way this kind of thing is handled is by adding an np.ma.histogram function.

-n
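A minimal sketch of what such a function could look like, following Eric's description above (the name masked_histogram and its exact signature are made up for illustration, not an existing API):

import numpy as np

def masked_histogram(a, bins=10, range=None, weights=None):
    # Combine the mask of `a` with the mask of `weights` (if any),
    # drop the masked entries, and delegate to np.histogram.
    a = np.ma.asarray(a)
    keep = ~np.ma.getmaskarray(a)
    if weights is not None:
        weights = np.ma.asarray(weights)
        keep &= ~np.ma.getmaskarray(weights)
        weights = np.asarray(weights)[keep]
    return np.histogram(np.asarray(a)[keep], bins=bins, range=range, weights=weights)

With both the data and the weights masked, only entries that are unmasked in both contribute to the counts.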
From ralf.gommers at gmail.com  Mon Mar 23 17:21:44 2015
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Mon, 23 Mar 2015 22:21:44 +0100
Subject: [Numpy-discussion] GSoC students: please read

Hi all,

It's great to see that this year there are a lot of students interested in doing a GSoC project with Numpy or Scipy. So far five proposals have been submitted, and it looks like several more are being prepared now. I'd like to give you a bit of advice as well as an idea of what's going to happen in the next few weeks.

The deadline for submitting applications is 27 March. Don't wait until the last day to submit your proposal! It has happened before that Melange was overloaded and unavailable - the Google program admins will not accept that as an excuse and allow you to submit later. So as soon as your proposal is in good shape, put it in. You can still continue revising it.

From 28 March until 13 April we will continue to interact with you, as we request slots from the PSF and rank the proposals. We don't know how many slots we will get this year, but to give you an impression: for the last two years we got 2 slots. Hopefully we can get more this year, but that's far from certain.

Our ranking will be based on a combination of factors: the interaction you've had with potential mentors and the community until now (and continue to have), the quality of your submitted PRs, quality and projected impact of your proposal, your enthusiasm, match with potential mentors, etc. We will also organize a video call (Skype / Google Hangout / ...) with each of you during the first half of April to be able to exchange ideas with a higher communication bandwidth medium than email.

Finally a note on mentoring: we will be able to mentor all proposals submitted or suggested until now. Due to the large interest and technical nature of a few topics it has in some cases taken a bit long to provide feedback on draft proposals, however there are no showstoppers in this regard. Please continue improving your proposals and working with your potential mentors.

Cheers,
Ralf

From shoyer at gmail.com  Mon Mar 23 17:29:56 2015
From: shoyer at gmail.com (Stephan Hoyer)
Date: Mon, 23 Mar 2015 14:29:56 -0700
Subject: [Numpy-discussion] GSoC students: please read

On Mon, Mar 23, 2015 at 2:21 PM, Ralf Gommers wrote:
> It's great to see that this year there are a lot of students interested in doing a GSoC project with Numpy or Scipy. So far five proposals have been submitted, and it looks like several more are being prepared now.

Hi Ralf,

Is there a centralized place for non-mentors to view proposals and give feedback?

Thanks,
Stephan

From ralf.gommers at gmail.com  Mon Mar 23 17:42:34 2015
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Mon, 23 Mar 2015 22:42:34 +0100
Subject: [Numpy-discussion] GSoC students: please read

On Mon, Mar 23, 2015 at 10:29 PM, Stephan Hoyer wrote:
> Is there a centralized place for non-mentors to view proposals and give feedback?

Hi Stephan, there isn't really. All students post their drafts to the mailing list, where they can get feedback. They're free to keep that draft wherever they want - blogs, Github, StackEdit, ftp sites and more are all being used. The central overview is in Melange (the official GSoC tool), but that's not publicly accessible.

Note that an overview of project ideas can be found at https://github.com/scipy/scipy/wiki/GSoC-project-ideas. If you're particularly interested in one or more of those, it should be easy to find in the mailing list archive which students sent draft proposals for feedback. Your comments on individual proposals will be much appreciated.

Cheers,
Ralf
From thomas.p.krauss at gmail.com  Mon Mar 23 18:04:18 2015
From: thomas.p.krauss at gmail.com (Tom Krauss)
Date: Mon, 23 Mar 2015 17:04:18 -0500
Subject: [Numpy-discussion] numpy.i: passing in in-place ND array as "flat" 1D array

I have a method on a C++ object that treats all elements the same and modifies the array in-place (quantizes each value). Usually I just have a vector, i.e. a 1D array. But today I wanted to quantize a 2D array, and the (DATA_TYPE* INPLACE_ARRAY1, DIM_TYPE DIM1) typemap failed to do the trick, because of the require_dimensions(array, 1) call.

So I created a new typemap in numpy.i, (DATA_TYPE* INPLACE_ARRAY_FLAT, DIM_TYPE DIM_FLAT), that omits the call to "require_dimensions", and behold! It works for both 1D and 2D (and really any D).

Does this seem like a reasonable addition to numpy.i? Or is there another way to do this that I am missing?

Regards, Tom
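For context, applying such a typemap in a SWIG interface file would presumably mirror the existing 1D case; a rough sketch (the %apply signature and the quantize prototype below are made up for illustration):

    /* example.i -- hypothetical use of the proposed typemap */
    %apply (double* INPLACE_ARRAY_FLAT, int DIM_FLAT) {(double* data, int n)};
    void quantize(double* data, int n);   /* element-wise, modifies data in place */

From Python the wrapped function would then accept an array of any shape and operate on it as if flattened, e.g. example.quantize(img) with img a 2D float64 array.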
From cengoguzhanunlu at gmail.com  Mon Mar 23 19:16:40 2015
From: cengoguzhanunlu at gmail.com (Oğuzhan Ünlü)
Date: Tue, 24 Mar 2015 01:16:40 +0200
Subject: [Numpy-discussion] Asking proposal review/feedback for GSOC 15

Hi again,

Thanks Ralf, I understand that. Then, I would like to share a public link to my proposal, and I appreciate anybody who takes the time to leave a comment/give feedback. It is on Github Gist.

URL: https://gist.github.com/oguzhanunlu/1f8bf3ffc6ac5c420dd1

Thanks in advance,
Oguzhan

From akrock930701 at gmail.com  Mon Mar 23 20:36:05 2015
From: akrock930701 at gmail.com (Орипов Акбар)
Date: Tue, 24 Mar 2015 01:36:05 +0100
Subject: [Numpy-discussion] Vector math library integration

Hello!

I want to contribute to NumPy/SciPy, namely I am interested in the project Vector math library integration. I have good skills in C and Python, so I can make it. Please send me additional information about this idea asap.

Have a nice day!

Best regards,
Akbar
IRC: aki93 at freenode dot net

From n59_ru at hotmail.com  Tue Mar 24 04:40:33 2015
From: n59_ru at hotmail.com (Nikolay Mayorov)
Date: Tue, 24 Mar 2015 13:40:33 +0500
Subject: [Numpy-discussion] Asking proposal review/feedback for GSOC 15

Hi, Oguzhan. I suggest adding a .md extension to the gist file; right now it is displayed as raw text.

From gregor.thalhammer at gmail.com  Tue Mar 24 05:32:04 2015
From: gregor.thalhammer at gmail.com (Gregor Thalhammer)
Date: Tue, 24 Mar 2015 10:32:04 +0100
Subject: [Numpy-discussion] Introductory mail and GSoc Project "Vector math library integration"
In-Reply-To: <55018B06.7050804@googlemail.com>
References: <0C44B0F4-D0F6-474A-AD0D-E2A486165691@gmail.com> <55018B06.7050804@googlemail.com>
Message-ID: <63332791-3B02-4856-A9A9-7A7D03394E9D@gmail.com>

> On 12.03.2015 at 13:48, Julian Taylor wrote:
> On 03/12/2015 10:15 AM, Gregor Thalhammer wrote:
>> Another note, numpy makes it easy to provide new ufuncs, see
>> http://docs.scipy.org/doc/numpy-dev/user/c-info.ufunc-tutorial.html
>> from a C function that operates on 1D arrays, but this function needs to support arbitrary spacing (stride) between the items. Unfortunately, to achieve good performance, vector math libraries often expect that the items are laid out contiguously in memory. MKL/VML is a notable exception. So for non-contiguous in- or output arrays you might need to copy the data to a buffer, which likely kills large amounts of the performance gain.
>
> The elementary functions are very slow even compared to memory access; they take on the order of hundreds to tens of thousands of cycles to complete (depending on range and required accuracy). Even in the case of strided access that gives the hardware prefetchers plenty of time to load the data before the previous computation is done.

That might apply to the mathematical functions from the standard libraries, but it is not true for the optimized libraries. Typical numbers are 4-10 CPU cycles per operation, see e.g.
https://software.intel.com/sites/products/documentation/doclib/mkl_sa/112/vml/functions/_performanceall.html

The benchmarks at https://github.com/geggo/uvml show that memory access to main memory limits the performance for the calculation of exp for large array sizes. This test was done quite some time ago; memory bandwidth now typically is higher, but so is computational power.

> This also removes the requirement from the library to provide a strided api, we can copy the strided data into a contiguous buffer and pass it to the library without losing much performance. It may not be optimal (e.g. a library can fine tune the prefetching better for the case where the hardware is not ideal) but most likely sufficient.

Copying the data to a small enough buffer so it fits into cache might add a few cycles; this already impacts performance significantly. Curious to see how much.

Gregor

> Figuring out how to best do it to get the best performance and still being flexible in what implementation is used is part of the challenge the student will face for this project.
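To make the buffering strategy concrete, a pure-Python sketch of the idea (illustration only; the real implementation would live in C inside the ufunc loops, and apply_blocked is a made-up name):

import numpy as np

def apply_blocked(func, x, block=8192):
    # Gather each strided block into a small contiguous array
    # (x.flat[i:i+block] makes a contiguous copy), run the fast
    # vectorized `func` on it, and scatter the result back.
    out = np.empty(x.shape, dtype=x.dtype)
    for i in range(0, x.size, block):
        out.flat[i:i+block] = func(x.flat[i:i+block])
    return out

a = np.linspace(0.0, 1.0, 1000001)
y = apply_blocked(np.exp, a[::2])   # works on a non-contiguous view

Keeping the block small enough to stay in cache is what limits the cost of the extra copies that Gregor is worried about.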
From questions.anon at gmail.com  Tue Mar 24 06:02:31 2015
From: questions.anon at gmail.com (questions anon)
Date: Tue, 24 Mar 2015 21:02:31 +1100
Subject: [Numpy-discussion] netcdf lat lon to coord - ValueError: need more than 1 value to unpack

I would like to find the nearest coord in a netcdf from a given latitude and longitude. I found some fantastic code that does this - http://nbviewer.ipython.org/github/Unidata/unidata-python-workshop/blob/master/netcdf-by-coordinates.ipynb - but I keep receiving a ValueError: need more than 1 value to unpack.

I have pasted the code and full error below. Any help will be greatly appreciated.

import numpy as np
import netCDF4

def naive_fast(latvar, lonvar, lat0, lon0):
    # Read latitude and longitude from file into numpy arrays
    latvals = latvar[:]
    lonvals = lonvar[:]
    ny, nx = latvals.shape
    dist_sq = (latvals-lat0)**2 + (lonvals-lon0)**2
    minindex_flattened = dist_sq.argmin()  # 1D index of min element
    iy_min, ix_min = np.unravel_index(minindex_flattened, latvals.shape)
    return iy_min, ix_min

filename = "/Users/T_SFC.nc"
ncfile = netCDF4.Dataset(filename, 'r')
latvar = ncfile.variables['latitude']
lonvar = ncfile.variables['longitude']

iy, ix = naive_fast(latvar, lonvar, -38.009, 146.438)
print 'Closest lat lon:', latvar[iy,ix], lonvar[iy,ix]
ncfile.close()

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
/Applications/Canopy.app/appdata/canopy-1.3.0.1715.macosx-x86_64/Canopy.app/Contents/lib/python2.7/site-packages/IPython/utils/py3compat.pyc in execfile(fname, *where)
    202         else:
    203             filename = fname
--> 204         __builtin__.execfile(filename, *where)

/Users/latlon_to_closestgrid.py in <module>()
     22 lonvar = ncfile.variables['longitude']
     23
---> 24 iy,ix = naive_fast(latvar, lonvar, -38.009, 146.438)
     25 print 'Closest lat lon:', latvar[iy,ix], lonvar[iy,ix]
     26 ncfile.close()

/Users/latlon_to_closestgrid.py in naive_fast(latvar, lonvar, lat0, lon0)
     12     latvals = latvar[:]
     13     lonvals = lonvar[:]
---> 14     ny,nx = latvals.shape
     15     dist_sq = (latvals-lat0)**2 + (lonvals-lon0)**2
     16     minindex_flattened = dist_sq.argmin()  # 1D index of min element

ValueError: need more than 1 value to unpack

From kikocorreoso at gmail.com  Tue Mar 24 06:14:01 2015
From: kikocorreoso at gmail.com (Kiko)
Date: Tue, 24 Mar 2015 11:14:01 +0100
Subject: [Numpy-discussion] netcdf lat lon to coord - ValueError: need more than 1 value to unpack

2015-03-24 11:02 GMT+01:00 questions anon:
It seems that latvals and lonvals should be 2D arrays and you are providing just 1D arrays. Maybe you could use numpy.meshgrid [1] to get 2D inputs from 1D arrays.

[1] http://docs.scipy.org/doc/numpy/reference/generated/numpy.meshgrid.html
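Concretely, the fix inside naive_fast could look like this (a sketch, assuming the file stores 1D latitude/longitude coordinate variables as named in the original script):

    latvals = latvar[:]
    lonvals = lonvar[:]
    if latvals.ndim == 1:
        # Expand the 1D coordinate axes into 2D (ny, nx) grids so that
        # ny, nx = latvals.shape and the distance computation still work.
        lonvals, latvals = np.meshgrid(lonvals, latvals)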
From Nicolas.Rougier at inria.fr  Tue Mar 24 07:36:24 2015
From: Nicolas.Rougier at inria.fr (Nicolas P. Rougier)
Date: Tue, 24 Mar 2015 12:36:24 +0100
Subject: [Numpy-discussion] EuroScipy 2015: Call for talks, posters & tutorials
Message-ID: <24A72082-A726-4BAC-B8FD-F0771A4C4782@inria.fr>

Dear all,

EuroScipy 2015, the annual conference on Python in science, will take place in Cambridge, UK on 26-30 August 2015. The conference features two days of tutorials followed by two days of scientific talks & posters and an extra day dedicated to developer sprints. It is the major event in Europe in the field of technical/scientific computing within the Python ecosystem. Data scientists, analysts, quants, PhDs, scientists and students from more than 20 countries attended the conference last year.

The topics presented at EuroSciPy are very diverse, with a focus on advanced software engineering and original uses of Python and its scientific libraries, either in theoretical or experimental research, from both academia and the industry.

Submissions for posters, talks & tutorials (beginner and advanced) are welcome on our website at http://www.euroscipy.org/2015/
Sprint proposals should be addressed directly to the organisation at euroscipy-org at python.org

Important dates
Mar 24, 2015     Call for talks, posters & tutorials
Apr 30, 2015     Talk and tutorials submission deadline
May 1, 2015      Registration opens
May 30, 2015     Final program announced
Jun 15, 2015     Early-bird registration ends
Aug 26-27, 2015  Tutorials
Aug 28-29, 2015  Main conference
Aug 30, 2015     Sprints

We look forward to an exciting conference and hope to see you in Cambridge

The EuroSciPy 2015 Team - http://www.euroscipy.org/2015/

From cengoguzhanunlu at gmail.com  Tue Mar 24 08:12:00 2015
From: cengoguzhanunlu at gmail.com (Oğuzhan Ünlü)
Date: Tue, 24 Mar 2015 14:12:00 +0200
Subject: [Numpy-discussion] Asking proposal review/feedback for GSOC 15

Hi Nikolay,

Thanks for pointing that out! It really helped. I think it looks better and easier to review now.

I appreciate any comment/feedback. My proposal is at https://gist.github.com/oguzhanunlu/1f8bf3ffc6ac5c420dd1

Thanks in advance,
Oguzhan

> Hi, Oguzhan. I suggest adding a .md extension to the gist file; right now it is displayed as raw text.

From albzey at googlemail.com  Tue Mar 24 08:29:43 2015
From: albzey at googlemail.com (Albert Zeyer)
Date: Tue, 24 Mar 2015 13:29:43 +0100
Subject: [Numpy-discussion] Hang in numpy.zeros

Hi,

I have a multithreaded multiprocessing Python 2.7 project with Theano. In my main proc, I recognized a hang in numpy.zeros. It deadlocks while holding the Python GIL, so other native threads then block on the GIL. I found a somewhat related problem described here, about such a deadlock in numpy.dot:
http://stackoverflow.com/questions/23963997/python-child-process-crashes-on-numpy-dot-if-pyside-is-imported

I have Numpy 1.9.1 on Ubuntu 12.04. How can I fix this? Is this a known problem?

Kind Regards,
Albert

From questions.anon at gmail.com  Tue Mar 24 16:10:00 2015
From: questions.anon at gmail.com (questions anon)
Date: Wed, 25 Mar 2015 07:10:00 +1100
Subject: [Numpy-discussion] netcdf lat lon to coord - ValueError: need more than 1 value to unpack

perfect, thank you!

On Tue, Mar 24, 2015 at 9:14 PM, Kiko wrote:
> It seems that latvals and lonvals should be 2D arrays and you are providing just 1D arrays. Maybe you could use numpy.meshgrid [1] to get 2D inputs from 1D arrays.
> [1] http://docs.scipy.org/doc/numpy/reference/generated/numpy.meshgrid.html

From ralf.gommers at gmail.com  Tue Mar 24 17:55:55 2015
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Tue, 24 Mar 2015 22:55:55 +0100
Subject: [Numpy-discussion] Vector math library integration

On Tue, Mar 24, 2015 at 1:36 AM, Орипов Акбар wrote:
> Hello!
>
> I want to contribute to NumPy/SciPy, namely I am interested in the project Vector math library integration. I have good skills in C and Python, so I can make it. Please send me additional information about this idea asap.
>
> Have a nice day!

Hi Akbar, welcome. There is quite some discussion on this topic in this thread: http://thread.gmane.org/gmane.comp.python.numeric.general/60080. That plus the general guidelines on the ideas page about how to structure your proposal and other requirements should be enough to get you started. Please ask if you have specific questions.

Cheers,
Ralf
From ralf.gommers at gmail.com  Tue Mar 24 18:03:39 2015
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Tue, 24 Mar 2015 23:03:39 +0100
Subject: [Numpy-discussion] Improve Numpy Datetime Functionality for Gsoc

Hi Saprative and Smruti,

Sorry for the slow reply, I overlooked this thread. http://thread.gmane.org/gmane.comp.python.numeric.general/53805 and the discussion that followed (also linked from the ideas page) should give you some idea of what is required.

If you want to start working on a patch I recommend to start small: https://github.com/numpy/numpy/issues?q=is%3Aopen+is%3Aissue+label%3A%22Easy+Fix%22
There are also a number of related issues that you could look at: https://github.com/numpy/numpy/issues?utf8=%E2%9C%93&q=is%3Aissue+is%3Aopen+datetime. Trying to tackle one of those should give you an idea of the level of difficulty of this project (it's one of the harder ones on our list).

Cheers,
Ralf

On Thu, Mar 19, 2015 at 11:15 AM, SMRUTI RANJAN SAHOO wrote:
> i am also student developer. if i will get anything i will tell you.
>
> On Thu, Mar 19, 2015 at 11:55 AM, Saprative Jana wrote:
>> hi,
>> I am Saprative .I am new to numpy devlopment. I want to work on the project of improving datetime functionality numpy project .I want to solve some related bugs and get started with the basics. As there is no irc channel for numpy so i am facing a problem of contacting with the mentors moreover there is no mentors mentioned for this project. So anybody who can help me out please contact with me.
>> from,
>> Saprative Jana
>> (Mob: +919477325233)

From ralf.gommers at gmail.com  Tue Mar 24 18:31:47 2015
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Tue, 24 Mar 2015 23:31:47 +0100
Subject: [Numpy-discussion] Asking proposal review/feedback for GSOC 15

On Tue, Mar 24, 2015 at 1:12 PM, Oğuzhan Ünlü wrote:
> Hi Nikolay,
>
> Thanks for pointing that out! It really helped. I think it looks better and easier to review now.
>
> I appreciate any comment/feedback. My proposal is at https://gist.github.com/oguzhanunlu/1f8bf3ffc6ac5c420dd1

Regarding your schedule:
- I would remove the parts related to benchmarks. There's no nice benchmark infrastructure in numpy itself at the moment (that's a separate GSoC idea), so the two times 1 week that you have are likely not enough to get something off the ground there.
- The "implement a flexible interface" part will need some discussion; probably it makes sense to first draft a document (call it a NEP - Numpy Enhancement Proposal) that lays out the options and makes a proposal.
- I wouldn't put "investigate accuracy differences" at the end. What if you find out there that you've been working on something for the whole summer that's not accurate enough?
- The "researching possible options" I would do in the community bonding period - when the coding period starts you should have a fairly well-defined plan.
- 3 weeks for implementing the interface looks optimistic.
Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From c99.smruti at gmail.com Wed Mar 25 01:00:43 2015 From: c99.smruti at gmail.com (SMRUTI RANJAN SAHOO) Date: Wed, 25 Mar 2015 10:30:43 +0530 Subject: [Numpy-discussion] Improve Numpy Datetime Functionality for Gsoc In-Reply-To: References: Message-ID: you are saying that if i will find out this bugs ,then i will selected for gsoc 2015 ?? and where i will find my mentor?? On Wed, Mar 25, 2015 at 3:33 AM, Ralf Gommers wrote: > Hi Saprative and Smruti, > > Sorry for the slow reply, I overlooked this thread. > http://thread.gmane.org/gmane.comp.python.numeric.general/53805 and the > discussion that followed (also linked from the ideas page) should give you > some idea of what is required. > > If you want to start working on a patch I recommend to start small: > https://github.com/numpy/numpy/issues?q=is%3Aopen+is%3Aissue+label%3A%22Easy+Fix%22 > There are also a number of related issues that you could look at: > https://github.com/numpy/numpy/issues?utf8=%E2%9C%93&q=is%3Aissue+is%3Aopen+datetime. > Trying to tackly one of those should give you an idea of the level of > difficulty of this project (it's one of the harder ones on our list). > > Cheers, > Ralf > > > > > On Thu, Mar 19, 2015 at 11:15 AM, SMRUTI RANJAN SAHOO < > c99.smruti at gmail.com> wrote: > >> i am also student developer. if i will get anything i will tell you. >> >> >> On Thu, Mar 19, 2015 at 11:55 AM, Saprative Jana > > wrote: >> >>> hi, >>> I am Saprative .I am new to numpy devlopment. I want to work on the >>> project of improving datetime functionality numpy project .I want to solve >>> some related bugs and get started with the basics. As there is no irc >>> channel for numpy so i am facing a problem of contacting with the mentors >>> moreover there is no mentors mentioned for this project. So anybody who can >>> help me out please contact with me. >>> from, >>> Saprative Jana >>> (Mob: +919477325233) >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >>> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From shoyer at gmail.com Wed Mar 25 01:50:58 2015 From: shoyer at gmail.com (Stephan Hoyer) Date: Tue, 24 Mar 2015 22:50:58 -0700 Subject: [Numpy-discussion] Improve Numpy Datetime Functionality for Gsoc In-Reply-To: References: Message-ID: The most recent discussion about datetime64 was back in March and April of last year: http://mail.scipy.org/pipermail/numpy-discussion/2014-March/thread.html#69554 http://mail.scipy.org/pipermail/numpy-discussion/2014-April/thread.html#69774 In addition to unfortunate timezone handling, datetime64 has a lot of bugs -- so many that I don't bother reporting them. But if anyone ever plans on working on them, I can certainly help to assemble a long list of the issues (many of these are mentioned in the above threads). 
Unfortunately, though I would love to see datetime64 fixed, I'm not really a suitable mentor for this role (I don't know C), -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Wed Mar 25 02:53:43 2015 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Wed, 25 Mar 2015 07:53:43 +0100 Subject: [Numpy-discussion] Improve Numpy Datetime Functionality for Gsoc In-Reply-To: References: Message-ID: On Wed, Mar 25, 2015 at 6:00 AM, SMRUTI RANJAN SAHOO wrote: > you are saying that if i will find out this bugs ,then i will selected for > gsoc 2015 ?? > and where i will find my mentor?? > No, that's not what I'm saying. Submitting a patch is a requirement from the Python Software Foundation (which is the umbrella org under which we are participating in GSOC) as explained on https://wiki.python.org/moin/SummerOfCode/2015 and on https://github.com/scipy/scipy/wiki/GSoC-project-ideas Regarding mentoring: we have currently 5 mentors signed up in Melange, and will be able to find topic-specific ones if needed. Given that the last couple of years we received 2 slots, we will be able to provide a mentor. It's just not yet clear which one, because we have several mentors who could mentor a number of proposed projects. Cheers, Ralf > > On Wed, Mar 25, 2015 at 3:33 AM, Ralf Gommers > wrote: > >> Hi Saprative and Smruti, >> >> Sorry for the slow reply, I overlooked this thread. >> http://thread.gmane.org/gmane.comp.python.numeric.general/53805 and the >> discussion that followed (also linked from the ideas page) should give you >> some idea of what is required. >> >> If you want to start working on a patch I recommend to start small: >> https://github.com/numpy/numpy/issues?q=is%3Aopen+is%3Aissue+label%3A%22Easy+Fix%22 >> There are also a number of related issues that you could look at: >> https://github.com/numpy/numpy/issues?utf8=%E2%9C%93&q=is%3Aissue+is%3Aopen+datetime. >> Trying to tackly one of those should give you an idea of the level of >> difficulty of this project (it's one of the harder ones on our list). >> >> Cheers, >> Ralf >> >> >> >> >> On Thu, Mar 19, 2015 at 11:15 AM, SMRUTI RANJAN SAHOO < >> c99.smruti at gmail.com> wrote: >> >>> i am also student developer. if i will get anything i will tell you. >>> >>> >>> On Thu, Mar 19, 2015 at 11:55 AM, Saprative Jana < >>> saprativejana at gmail.com> wrote: >>> >>>> hi, >>>> I am Saprative .I am new to numpy devlopment. I want to work on the >>>> project of improving datetime functionality numpy project .I want to solve >>>> some related bugs and get started with the basics. As there is no irc >>>> channel for numpy so i am facing a problem of contacting with the mentors >>>> moreover there is no mentors mentioned for this project. So anybody who can >>>> help me out please contact with me. 
>>>> from, >>>> Saprative Jana >>>> (Mob: +919477325233) >>>> >>>> _______________________________________________ >>>> NumPy-Discussion mailing list >>>> NumPy-Discussion at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>>> >>>> >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >>> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Wed Mar 25 02:56:17 2015 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Wed, 25 Mar 2015 07:56:17 +0100 Subject: [Numpy-discussion] Improve Numpy Datetime Functionality for Gsoc In-Reply-To: References: Message-ID: On Wed, Mar 25, 2015 at 6:50 AM, Stephan Hoyer wrote: > The most recent discussion about datetime64 was back in March and April of > last year: > > http://mail.scipy.org/pipermail/numpy-discussion/2014-March/thread.html#69554 > > http://mail.scipy.org/pipermail/numpy-discussion/2014-April/thread.html#69774 > > In addition to unfortunate timezone handling, datetime64 has a lot of bugs > -- so many that I don't bother reporting them. But if anyone ever plans on > working on them, I can certainly help to assemble a long list of the issues > (many of these are mentioned in the above threads). > > Unfortunately, though I would love to see datetime64 fixed, I'm not really > a suitable mentor for this role (I don't know C), > Hi Stephan, thanks for at least considering to mentor. It's always possible to help out as a secondary mentor - even if you don't know C, you could provide valuable feedback because unlike most numpy devs you're actually *using* datetime64. Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From c99.smruti at gmail.com Wed Mar 25 07:17:28 2015 From: c99.smruti at gmail.com (SMRUTI RANJAN SAHOO) Date: Wed, 25 Mar 2015 16:47:28 +0530 Subject: [Numpy-discussion] Improve Numpy Datetime Functionality for Gsoc In-Reply-To: References: Message-ID: so may i know the links for mentor ,so that i can talk with them more ?? can you provide me the links please ??? On Wed, Mar 25, 2015 at 12:26 PM, Ralf Gommers wrote: > > > On Wed, Mar 25, 2015 at 6:50 AM, Stephan Hoyer wrote: > >> The most recent discussion about datetime64 was back in March and April >> of last year: >> >> http://mail.scipy.org/pipermail/numpy-discussion/2014-March/thread.html#69554 >> >> http://mail.scipy.org/pipermail/numpy-discussion/2014-April/thread.html#69774 >> >> In addition to unfortunate timezone handling, datetime64 has a lot of >> bugs -- so many that I don't bother reporting them. But if anyone ever >> plans on working on them, I can certainly help to assemble a long list of >> the issues (many of these are mentioned in the above threads). >> >> Unfortunately, though I would love to see datetime64 fixed, I'm not >> really a suitable mentor for this role (I don't know C), >> > > Hi Stephan, thanks for at least considering to mentor. 
> It's always possible to help out as a secondary mentor - even if you don't know C, you could provide valuable feedback because unlike most numpy devs you're actually *using* datetime64.
>
> Cheers,
> Ralf

From ralf.gommers at gmail.com  Wed Mar 25 14:10:48 2015
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Wed, 25 Mar 2015 19:10:48 +0100
Subject: [Numpy-discussion] Improve Numpy Datetime Functionality for Gsoc

On Wed, Mar 25, 2015 at 12:17 PM, SMRUTI RANJAN SAHOO wrote:
> so may i know the links for mentor ,so that i can talk with them more ?? can you provide me the links please ???

Hi Smruti, there are no links. I am one of the mentors; the other mentors are all reading this list. You'll get the most feedback on this list, so please ask relevant technical questions here. If you have further administrative questions that you prefer not to post in public, you can email me privately.

Cheers,
Ralf

From jaime.frio at gmail.com  Wed Mar 25 16:36:59 2015
From: jaime.frio at gmail.com (Jaime Fernández del Río)
Date: Wed, 25 Mar 2015 13:36:59 -0700
Subject: [Numpy-discussion] Do you find this behavior surprising?

>>> import numpy as np
>>> a = np.arange(10)
>>> flags = a.flags
>>> flags
  C_CONTIGUOUS : True
  F_CONTIGUOUS : True
  OWNDATA : True
  WRITEABLE : True
  ALIGNED : True
  UPDATEIFCOPY : False
>>> flags.writeable = False
>>> a.flags
  C_CONTIGUOUS : True
  F_CONTIGUOUS : True
  OWNDATA : True
  WRITEABLE : False   <--- WTF!!??
ALIGNED : True
UPDATEIFCOPY : False

I understand why this is happening, and that there is no other obvious way to make

a.flags.writeable = False

work than to have the return of a.flags linked to a under the hood.

But I don't think this is documented anywhere, and wonder if perhaps it should be.

Jaime

--
(\__/)
( O.o)
( > <) Este es Conejo. Copia a Conejo en tu firma y ayúdale en sus planes de dominación mundial.

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ben.root at ou.edu  Wed Mar 25 16:45:07 2015
From: ben.root at ou.edu (Benjamin Root)
Date: Wed, 25 Mar 2015 16:45:07 -0400
Subject: [Numpy-discussion] Do you find this behavior surprising?
In-Reply-To: References: Message-ID:

I fail to see the wtf.

flags = a.flags

So, "flags" at this point is just an alias to "a.flags", just like any other variable in Python.

"flags.writeable = False" would then be equivalent to "a.flags.writeable = False". There is nothing numpy-specific here. a.flags is a mutable object. This is how Python works.

Ben Root

On Wed, Mar 25, 2015 at 4:36 PM, Jaime Fernández del Río <jaime.frio at gmail.com> wrote:
> >>> import numpy as np
> >>> a = np.arange(10)
> >>> flags = a.flags
> >>> flags.writeable = False
> >>> a.flags
> [...]
> WRITEABLE : False <--- WTF!!??
> [...]

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From jaime.frio at gmail.com  Wed Mar 25 17:11:17 2015
From: jaime.frio at gmail.com (=?UTF-8?Q?Jaime_Fern=C3=A1ndez_del_R=C3=ADo?=)
Date: Wed, 25 Mar 2015 14:11:17 -0700
Subject: [Numpy-discussion] Do you find this behavior surprising?
In-Reply-To: References: Message-ID:

On Wed, Mar 25, 2015 at 1:45 PM, Benjamin Root wrote:
>> I fail to see the wtf.
>>
>> flags = a.flags
>>
>> So, "flags" at this point is just an alias to "a.flags", just like any
>> other variable in Python.
>> [...]
>>
>> Ben Root

Ah, yes indeed. If you think of it that way it does make all the sense in the world.

But of course that is not what is actually going on, as flags is a single C int of the PyArrayObject struct, and a.flags is just a proxy built from it, and great coding contortions have to be made to have changes to the proxy rewritten into the owner array.
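In fact, a quick check makes the proxy nature visible: a brand new flags object is built on every attribute access (plain NumPy, nothing hypothetical here):

>>> a = np.arange(10)
>>> a.flags is a.flags  # a fresh proxy object each time
False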
I guess then the surprising behavior is this other one, which was the one I (wrongly) expected intuitively:

>>> a = np.arange(10)
>>> flags = a.flags
>>> a.flags.writeable = False
>>> flags
C_CONTIGUOUS : True
F_CONTIGUOUS : True
OWNDATA : True
WRITEABLE : True
ALIGNED : True
UPDATEIFCOPY : False

This could be fixed to work properly, although it is probably not worth worrying about much.

Properties of properties are weird...

Jaime

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ben.root at ou.edu  Wed Mar 25 17:34:52 2015
From: ben.root at ou.edu (Benjamin Root)
Date: Wed, 25 Mar 2015 17:34:52 -0400
Subject: [Numpy-discussion] Do you find this behavior surprising?
In-Reply-To: References: Message-ID:

Ah, *that* example is surprising to me. Regardless of whether it is a C int of the PyArrayObject struct or not, the way it is presented at the Python code level should make sense. From my perspective, a.flags is a mutable object of some sort. Updating it should act like a mutable object, not some other magical object that doesn't work like anything else in Python.

Ben Root

On Wed, Mar 25, 2015 at 5:11 PM, Jaime Fernández del Río <jaime.frio at gmail.com> wrote:
> Ah, yes indeed. If you think of it that way it does make all the sense in
> the world.
> [...]
> I guess then the surprising behavior is this other one, which was the one
> I (wrongly) expected intuitively:
> [...]
> Properties of properties are weird...
>
> Jaime

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From jaime.frio at gmail.com  Thu Mar 26 03:51:21 2015
From: jaime.frio at gmail.com (=?UTF-8?Q?Jaime_Fern=C3=A1ndez_del_R=C3=ADo?=)
Date: Thu, 26 Mar 2015 00:51:21 -0700
Subject: [Numpy-discussion] flags attribute of ndarrays
Message-ID:

Hi,

I have just submitted a longish issue on the github repo, #5721. After going over the flags attribute of ndarray's in quite some detail, I have found several inconsistencies with the C API, and would like to make some changes. The details are in gh, but as a high-level summary:

1. arr.flags.farray is almost certainly broken, and returns some useless result. I would like to make it behave like the C API's PyArray_ISFARRAY, even though that doesn't seem to be the intention of the original coder.
2. arr.flags.fortran is inconsistent with the C API's PyArray_ISFORTRAN.
I think it should be modified to match it, but understand this may be too much of a backwards compatibility breach.
3. I would like for `arr.flags` to truly behave as a mutable property of `arr`. An explanation can be found in another thread's discussion with Benjamin Root earlier today.
4. I would like to match the Python and C versions of these as much as possible, and avoid future deviation, by actually using the C versions in the Python ones, even if this may introduce subtle behavior changes.

Feedback is very welcome.

Jaime

--
(\__/)
( O.o)
( > <) Este es Conejo. Copia a Conejo en tu firma y ayúdale en sus planes de dominación mundial.

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From cengoguzhanunlu at gmail.com  Thu Mar 26 07:58:33 2015
From: cengoguzhanunlu at gmail.com (=?UTF-8?B?T8SfdXpoYW4gw5xubMO8?=)
Date: Thu, 26 Mar 2015 13:58:33 +0200
Subject: [Numpy-discussion] Asking proposal review/feedback for GSOC 15
Message-ID:

Hi,

Sorry for a bit late reply. I will address Ralf's suggestions in turn.

> Regarding your schedule:
> - I would remove the parts related to benchmarks. There's no nice benchmark
> infrastructure in numpy itself at the moment (that's a separate GSoC idea),
> so the two times 1 week that you have are likely not enough to get
> something off the ground there.

- I think we can do a sample/demo benchmark based only on a library's speed over some basic set of data sets. Couldn't we? Instead of speed, it could be any other performance parameter; we can decide together.

> - The "implement a flexible interface" part will need some discussion,
> probably it makes sense to first draft a document (call it a NEP - Numpy
> Enhancement Proposal) that lays out the options and makes a proposal.

To be realistic, I don't think I have enough time to complete an enhancement proposal. Maybe we can talk about it in the first half of April?

> - I wouldn't put "investigate accuracy differences" at the end. What if you
> find out there that you've been working on something for the whole summer
> that's not accurate enough?

However, we can't examine possible accuracy differences without having seen their real performance (in my case it is 'implementing an interface to libraries'). Isn't investigating possible libraries for numpy the starting point of this project? Integrating the chosen library should be possible with a small set of wrapping functions.

> - The "researching possible options" I would do in the community bonding
> period - when the coding period starts you should have a fairly
> well-defined plan.

I agree with you at this point. After moving this to the community bonding period, I can put a milestone like 'integrating the chosen library into numpy' for 2 weeks. And if we decide it would be better to remove the benchmark part, then I would probably use that time for the interface.

> - 3 weeks for implementing the interface looks optimistic.

It was an estimate; I asked Julian's opinion about it and am waiting for his answer. You could be right, I am not familiar with the codebase and the exact set of functions to be improved. Since I prepared my schedule to serve as a basis, I think it is understandable if something takes a bit longer or shorter than what is written on the schedule.
My proposal is at https://gist.github.com/oguzhanunlu/1f8bf3ffc6ac5c420dd1

Cheers,
Oguzhan

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From marcospc6 at gmail.com  Thu Mar 26 08:05:49 2015
From: marcospc6 at gmail.com (Marcos .)
Date: Thu, 26 Mar 2015 09:05:49 -0300
Subject: [Numpy-discussion] Proposal feedback
Message-ID:

Hello, I'm Marcos from Brazil! I've recently finalized a sketch of my proposal (on porting core functions of numpy from C to Cython) and I would like some feedback from you.

It can be seen at the following link:
https://github.com/marcospc6/GSoC-Proposal/blob/master/gsocproposal.md

Couldn't get the formatting right on GitHub, but you get the idea.

Thank you all

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From jtaylor.debian at googlemail.com  Thu Mar 26 14:27:33 2015
From: jtaylor.debian at googlemail.com (Julian Taylor)
Date: Thu, 26 Mar 2015 19:27:33 +0100
Subject: Re: [Numpy-discussion] Asking proposal review/feedback for GSOC 15
In-Reply-To: References: Message-ID: <55144F95.9020908@googlemail.com>

On 03/26/2015 12:58 PM, Oğuzhan Ünlü wrote:
> Hi,
>
> Sorry for a bit late reply. I will address Ralf's suggestions in turn.
>
> > Regarding your schedule:
> > - I would remove the parts related to benchmarks. [...]
>
> - I think we can do a sample/demo benchmark based only on a library's
> speed over some basic set of data sets. Couldn't we? Instead of speed,
> it could be any other performance parameter; we can decide together.

Creating benchmark and performance tracking tools should not be part of this project, but benchmarking is still important. You may have to research and learn how best to benchmark this low-level code and understand what influences its performance, and we need a good set of benchmarks so we know in the end what we have gained by this project. I think the time allocation for this is good.

> > - The "implement a flexible interface" part will need some discussion,
> > probably it makes sense to first draft a document (call it a NEP - Numpy
> > Enhancement Proposal) that lays out the options and makes a proposal.
>
> To be realistic, I don't think I have enough time to complete an
> enhancement proposal. Maybe we can talk about it in the first half of
> April?

I think he means writing the NEP should be part of the project; you don't need to have a fleshed-out one ready now. Though if you already have some ideas on what the interface might look like, this should go into your proposal.

> > - I wouldn't put "investigate accuracy differences" at the end. What if you
> > find out there that you've been working on something for the whole summer
> > that's not accurate enough?
>
> However, we can't examine possible accuracy differences without having
> seen their real performance (in my case it is 'implementing an interface
> to libraries'). Isn't investigating possible libraries for numpy the
> starting point of this project? Integrating the chosen library should be
> possible with a small set of wrapping functions.

The accuracy of the libraries can be investigated prior to their integration into numpy, and it should be done early to rule out or de-prioritize bad options.
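As a rough illustration, a first-pass accuracy check can be as small as this (just a sketch: single-precision np.sin stands in here for a candidate library function, and the sampling range is an arbitrary choice of mine):

import numpy as np

x = np.random.uniform(-100.0, 100.0, 100000)
ref = np.sin(x)  # float64 reference values
# stand-in for a vector math library's result:
cand = np.sin(x.astype(np.float32)).astype(np.float64)
rel = np.abs(cand - ref) / np.maximum(np.abs(ref), np.finfo(np.float64).tiny)
print("max relative error:", rel.max())

Problematic inputs (huge arguments, values near the zeros of a function) deserve their own dedicated samples.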
Documenting the trade-offs between performance and accuracy is one of the most important tasks. This also involves researching what kinds of inputs to functions may be numerically problematic, which, depending on your prior numerics knowledge, may take some time and should be accounted for.

> > - The "researching possible options" I would do in the community bonding
> > period - when the coding period starts you should have a fairly
> > well-defined plan.
>
> I agree with you at this point. After moving this to the community bonding
> period, I can put a milestone like 'integrating the chosen library into
> numpy' for 2 weeks. And if we decide it would be better to remove the
> benchmark part, then I would probably use that time for the interface.
>
> > - 3 weeks for implementing the interface looks optimistic.
>
> It was an estimate; I asked Julian's opinion about it and am waiting for
> his answer. You could be right, I am not familiar with the codebase and
> the exact set of functions to be improved. Since I prepared my schedule to
> serve as a basis, I think it is understandable if something takes a bit
> longer or shorter than what is written on the schedule.

I am pretty bad at estimating times, but I think the implementation of the interfaces can be done in three weeks if you are confident enough in your C coding abilities and have some experience maneuvering foreign code bases.

From mshubhankar at yahoo.co.in  Fri Mar 27 03:22:06 2015
From: mshubhankar at yahoo.co.in (Shubhankar Mohapatra)
Date: Fri, 27 Mar 2015 07:22:06 +0000 (UTC)
Subject: Re: [Numpy-discussion] Mathematical functions in Numpy
In-Reply-To: <5508880E.6050406@googlemail.com>
References: <5508880E.6050406@googlemail.com>
Message-ID: <1586874282.1666490.1427440926615.JavaMail.yahoo@mail.yahoo.com>

Hello all,

I have submitted the proposal. It would be very nice if you would please give it a read and provide me with your feedback. I think I can make a file with different functions from different libraries, such as Intel's VML and AMD's ACML, alongside the existing SLEEF and Yeppp libraries. I understand that these functions may become outdated after some time and some other faster function may come up. Then the only way out is to update the file periodically. Please advise me of other methods if there are any. And please also tell me how an interface working between the libraries and Numpy would be better than an internal file containing the source code. Thanks a lot for giving your time.

The proposal: https://github.com/mshubhankar/gsoc2015

On Wednesday, 18 March 2015 1:31 AM, Julian Taylor wrote:

Currently the math functions are wrapped via the generic PyUfunc_* functions in numpy/core/src/umath/loops.c.src, which just apply some arbitrary function to a scalar from arbitrarily strided inputs. When adding variants one likely needs to add some special-purpose loops to deal with the various special requirements of the vector math APIs. This involves adding some special cases to the ufunc generation in numpy/core/code_generators/generate_umath.py and then implementing the new kernel functions. See e.g. this oldish PR, which changes the sqrt function from a PyUfunc_d_d function to a special loop to take advantage of the vectorized machine instructions: https://github.com/numpy/numpy/pull/3341

Some things have changed a bit since then, but it does show many of the files you probably need to look at for this project.
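In Python terms, the difference between the generic wrapper and a special-purpose loop is roughly the following (only an analogy -- the real code is C in loops.c.src, and the function names here are made up):

import math
import numpy as np

def generic_loop(scalar_func, src, dst):
    # what PyUfunc_d_d does in spirit: one scalar call per element,
    # walking the input with arbitrary strides
    for i in range(len(src)):
        dst[i] = scalar_func(src[i])

def special_loop(vector_func, src, dst):
    # what a dedicated kernel buys: one call over the whole contiguous
    # block, where SIMD instructions or a vector math library can kick in
    dst[:] = vector_func(src)

x = np.linspace(0.0, 1.0, 1000)
y = np.empty_like(x)
generic_loop(math.sqrt, x, y)
special_loop(np.sqrt, x, y)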
On 17.03.2015 19:51, Robert Kern wrote: > On Tue, Mar 17, 2015 at 6:29 PM, Matthieu Brucher > > wrote: >> >> Hi, >> >> These functions are defined in the C standard library! > > I think he's asking how to define numpy ufuncs. > >> 2015-03-17 18:00 GMT+00:00 Shubhankar Mohapatra > >: >> > Hello all, >> > I am a undergraduate and i am trying to do a project this time on > numppy in >> > gsoc. This project is about integrating vector math library classes > of sleef >> > and yeppp into numpy to make the mathematical functions faster. I have >> > already studied the new library classes but i am unable to find the > sin , >> > cos function definitions in the numpy souce code.Can someone please > help me >> > find the functions in the source code so that i can implement the new >> > library class into numpy. >> > Thanking you, >> > Shubhankar Mohapatra >> > >> > >> > _______________________________________________ >> > NumPy-Discussion mailing list >> > NumPy-Discussion at scipy.org >> > http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > >> >> >> >> -- >> Information System Engineer, Ph.D. >> Blog: http://matt.eifelle.com >> LinkedIn: http://www.linkedin.com/in/matthieubrucher >> Music band: http://liliejay.com/ >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > -- > Robert Kern > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Fri Mar 27 05:58:11 2015 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Fri, 27 Mar 2015 10:58:11 +0100 Subject: [Numpy-discussion] GSoC students: please read In-Reply-To: References: Message-ID: On Mon, Mar 23, 2015 at 10:21 PM, Ralf Gommers wrote: > Hi all, > > It's great to see that this year there are a lot of students interested in > doing a GSoC project with Numpy or Scipy. So far five proposals have been > submitted, and it looks like several more are being prepared now. I'd like > to give you a bit of advice as well as an idea of what's going to happen in > the few weeks. > > The deadline for submitting applications is 27 March. Don't wait until the > last day to submit your proposal! It has happened before that Melange was > overloaded and unavailable - the Google program admins will not accept that > as an excuse and allow you to submit later. So as soon as your proposal is > in good shape, put it in. You can still continue revising it. > > From 28 March until 13 April we will continue to interact with you, as we > request slots from the PSF and rank the proposals. We don't know how many > slots we will get this year, but to give you an impression: for the last > two years we got 2 slots. Hopefully we can get more this year, but that's > far from certain. > > Our ranking will be based on a combination of factors: the interaction > you've had with potential mentors and the community until now (and continue > to have), the quality of your submitted PRs, quality and projected impact > of your proposal, your enthusiasm, match with potential mentors, etc. We > will also organize a video call (Skype / Google Hangout / ...) 
with each of > you during the first half of April to be able to exchange ideas with a > higher communication bandwidth medium than email. > > Finally a note on mentoring: we will be able to mentor all proposals > submitted or suggested until now. Due to the large interest and technical > nature of a few topics it has in some cases taken a bit long to provide > feedback on draft proposals, however there are no showstoppers in this > regard. Please continue improving your proposals and working with your > potential mentors. > Hi all, just a heads up that I'll be offline until next Friday. Good luck everyone with the last-minute proposal edits. I plan to contact all students that submitted a GSoC application next weekend with more details on what will happen next and see when we can schedule a call. Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From michael.klemm at intel.com Fri Mar 27 10:23:56 2015 From: michael.klemm at intel.com (Klemm, Michael) Date: Fri, 27 Mar 2015 14:23:56 +0000 Subject: [Numpy-discussion] ANN: pyMIC v0.5 released Message-ID: <0DAB4B4FC42EAA41802458ADA9C2F82430108C12@IRSMSX104.ger.corp.intel.com> Announcement: pyMIC v0.5 ========================= I'm happy to announce the release of pyMIC v0.5. pyMIC is a Python module to offload computation in a Python program to the Intel Xeon Phi coprocessor. It contains offloadable arrays and device management functions. It supports invocation of native kernels (C/C++, Fortran) and blends in with Numpy's array types for float, complex, and int data types. For more information and downloads please visit pyMIC's Github page: https://github.com/01org/pyMIC. You can find pyMIC's mailinglist at https://lists.01.org/mailman/listinfo/pymic. Full change log: ================= Version 0.5 ---------------------------- - Introduced new kernel API that avoids insane pointer unpacking. - pyMIC now uses libxstreams as the offload back-end (https://github.com/hfp/libxstream). - Added smart pointers to make handling of fake pointers easier. Version 0.4 ---------------------------- - New low-level API to allocate, deallocate, and transfer data (see OffloadStream). - Support for in-place binary operators. - New internal design to handle offloads. Version 0.3 ---------------------------- - Improved handling of libraries and kernel invocation. - Trace collection (PYMIC_TRACE=1, PYMIC_TRACE_STACKS={none,compact,full}). - Replaced the device-centric API with a stream API. - Refactoring to better match PEP8 recommendations. - Added support for int(int64) and complex(complex128) data types. - Reworked the benchmarks and examples to fit the new API. - Bugfix: fixed syntax errors in OffloadArray. Version 0.2 ---------------------------- - Small improvements to the README files. - New example: Singular Value Decomposition. - Some documentation for the API functions. - Added a basic testsuite for unit testing (WIP). - Bugfix: benchmarks now use the latest interface. - Bugfix: numpy.ndarray does not offer an attribute 'order'. - Bugfix: number_of_devices was not visible after import. - Bugfix: member offload_array.device is now initialized. - Bugfix: use exception for errors w/ invoke_kernel & load_library. Version 0.1 ---------------------------- Initial release. 
Intel GmbH Dornacher Strasse 1 85622 Feldkirchen/Muenchen, Deutschland Sitz der Gesellschaft: Feldkirchen bei Muenchen Geschaeftsfuehrer: Christian Lamprechter, Hannes Schwaderer, Douglas Lusk Registergericht: Muenchen HRB 47456 Ust.-IdNr./VAT Registration No.: DE129385895 Citibank Frankfurt a.M. (BLZ 502 109 00) 600119052 From lamblinp at iro.umontreal.ca Fri Mar 27 12:13:22 2015 From: lamblinp at iro.umontreal.ca (Pascal Lamblin) Date: Fri, 27 Mar 2015 17:13:22 +0100 Subject: [Numpy-discussion] Announcing Theano 0.7 Message-ID: <20150327161322.GC25634@bob.blip.be> =========================== Announcing Theano 0.7 =========================== This is a release for a major version, with lots of new features, bug fixes, and some interface changes (deprecated or potentially misleading features were removed). Upgrading to Theano 0.7 is recommended for everyone, but you should first make sure that your code does not raise deprecation warnings with the version you are currently using. For those using the bleeding edge version in the git repository, we encourage you to update to the `rel-0.7` tag. What's New ---------- Highlights: * Integration of CuDNN for 2D convolutions and pooling on supported GPUs * Too many optimizations and new features to count * Various fixes and improvements to scan * Better support for GPU on Windows * On Mac OS X, clang is used by default * Many crash fixes * Some bug fixes as well Description ----------- Theano is a Python library that allows you to define, optimize, and efficiently evaluate mathematical expressions involving multi-dimensional arrays. It is built on top of NumPy. Theano features: * tight integration with NumPy: a similar interface to NumPy's. numpy.ndarrays are also used internally in Theano-compiled functions. * transparent use of a GPU: perform data-intensive computations up to 140x faster than on a CPU (support for float32 only). * efficient symbolic differentiation: Theano can compute derivatives for functions of one or many inputs. * speed and stability optimizations: avoid nasty bugs when computing expressions such as log(1+ exp(x)) for large values of x. * dynamic C code generation: evaluate expressions faster. * extensive unit-testing and self-verification: includes tools for detecting and diagnosing bugs and/or potential problems. Theano has been powering large-scale computationally intensive scientific research since 2007, but it is also approachable enough to be used in the classroom (IFT6266 at the University of Montreal). Resources --------- About Theano: http://deeplearning.net/software/theano/ Related projects: http://github.com/Theano/Theano/wiki/Related-projects About NumPy: http://numpy.scipy.org/ About SciPy: http://www.scipy.org/ Machine Learning Tutorial with Theano on Deep Architectures: http://deeplearning.net/tutorial/ Acknowledgments --------------- I would like to thank all contributors of Theano. For this particular release, many people have helped, and to list them all would be impractical. I would also like to thank users who submitted bug reports. Also, thank you to all NumPy and Scipy developers as Theano builds on their strengths. 
All questions/comments are always welcome on the Theano mailing-lists ( http://deeplearning.net/software/theano/#community )

-- Pascal

From Jerome.Kieffer at esrf.fr  Fri Mar 27 15:40:47 2015
From: Jerome.Kieffer at esrf.fr (Jerome Kieffer)
Date: Fri, 27 Mar 2015 20:40:47 +0100
Subject: Re: [Numpy-discussion] ANN: pyMIC v0.5 released
In-Reply-To: <0DAB4B4FC42EAA41802458ADA9C2F82430108C12@IRSMSX104.ger.corp.intel.com>
References: <0DAB4B4FC42EAA41802458ADA9C2F82430108C12@IRSMSX104.ger.corp.intel.com>
Message-ID: <20150327204047.50ae9449833e67ce05fecaf7@esrf.fr>

Hi,

Interesting project. How close does the C++ kernel need to be to an OpenCL kernel? Is it directly portable?

I have tested my OpenCL code (via pyopencl) on the Phi and I did not get better performance than on the dual-hexacore Xeon (i.e. ~2x slower than a GPU).

Cheers,
--
Jérôme Kieffer
Data analysis unit - ESRF

From tillsten at zedat.fu-berlin.de  Sat Mar 28 08:28:11 2015
From: tillsten at zedat.fu-berlin.de (Till Stensitzki)
Date: Sat, 28 Mar 2015 13:28:11 +0100
Subject: [Numpy-discussion] reaktionsschema
Message-ID:

A non-text attachment was scrubbed...
Name: engsurface-3.svg
Type: image/svg+xml
Size: 19025 bytes
Desc: not available
URL:

From blake.a.griffith at gmail.com  Sun Mar 29 19:39:00 2015
From: blake.a.griffith at gmail.com (Blake Griffith)
Date: Sun, 29 Mar 2015 18:39:00 -0500
Subject: [Numpy-discussion] Behavior of np.random.multivariate_normal with bad covariance matrices
Message-ID:

I have an open PR which lets users control the checks on the input covariance matrix. The matrix is required to be symmetric and positive semi-definite (PSD). The current behavior is that NumPy raises a warning if the matrix is not PSD, and does not even check for symmetry.

I added a symmetry check, which raises a warning when the input is not symmetric. And added two keyword args which users can use to turn off the checks/warnings when the matrix is ill-formed. So this would only cause another new warning to be raised in existing code.

This is needed because sometimes the covariance matrix is only *almost* symmetric or PSD due to roundoff error.

Thoughts?

PR: https://github.com/numpy/numpy/pull/5726

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From michael.klemm at intel.com  Mon Mar 30 04:46:33 2015
From: michael.klemm at intel.com (Klemm, Michael)
Date: Mon, 30 Mar 2015 08:46:33 +0000
Subject: Re: [Numpy-discussion] ANN: pyMIC v0.5 released
In-Reply-To: <20150327204047.50ae9449833e67ce05fecaf7@esrf.fr>
References: <0DAB4B4FC42EAA41802458ADA9C2F82430108C12@IRSMSX104.ger.corp.intel.com> <20150327204047.50ae9449833e67ce05fecaf7@esrf.fr>
Message-ID: <0DAB4B4FC42EAA41802458ADA9C2F82430117BAF@IRSMSX104.ger.corp.intel.com>

Hi Jerome,

> Interesting project. How close does the C++ kernel need to be to an
> OpenCL kernel?

That depends a bit on what the kernel does. If the kernel implements something really like a dgemm, it is just calling MKL's dgemm routine and passing the parameters into the routine. If the kernel does more, you would need regular C/C++ coding (with whatever is needed) plus a threading model such as OpenMP or TBB.

> Is it directly portable?
I would say no, just because OpenCL does a lot in terms of parallelization of the kernel, whereas pyMIC only gives you control over data transfer and passing control over to the coprocessor. Threading and SIMD vectorization are then done by the compiler and/or the programmer.

> I have tested my OpenCL code (via pyopencl) on the Phi and I did not get
> better performance than on the dual-hexacore Xeon (i.e. ~2x slower than a
> GPU).

What type of code are you offloading?

Cheers,
-michael

Intel GmbH
Dornacher Strasse 1
85622 Feldkirchen/Muenchen, Deutschland
Sitz der Gesellschaft: Feldkirchen bei Muenchen
Geschaeftsfuehrer: Christian Lamprechter, Hannes Schwaderer, Douglas Lusk
Registergericht: Muenchen HRB 47456
Ust.-IdNr./VAT Registration No.: DE129385895
Citibank Frankfurt a.M. (BLZ 502 109 00) 600119052

From josef.pktd at gmail.com  Mon Mar 30 09:34:38 2015
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Mon, 30 Mar 2015 09:34:38 -0400
Subject: Re: [Numpy-discussion] Behavior of np.random.multivariate_normal with bad covariance matrices
In-Reply-To: References: Message-ID:

On Sun, Mar 29, 2015 at 7:39 PM, Blake Griffith wrote:
> I have an open PR which lets users control the checks on the input
> covariance matrix. The matrix is required to be symmetric and positive
> semi-definite (PSD). The current behavior is that NumPy raises a warning if
> the matrix is not PSD, and does not even check for symmetry.
> [...]
> Thoughts?

My only question is: why is **exact** symmetry relevant? AFAIU, an empirical covariance matrix might not be exactly symmetric unless we specifically force it to be, but I don't see why roundoff errors that violate symmetry should be relevant. Use allclose with a floating-point rtol, or equivalent? Otherwise some user code might suddenly get irrelevant warnings.

BTW: neg = (np.sum(u.T * v, axis=1) < 0) & (s > 0) doesn't need to be calculated if cov_psd is false.

Some more: svd can hang if the values are not finite, i.e. NaNs or infs. A counter-proposal would be to add a `check_valid` keyword with options ignore, warn, raise, and "fix", and to raise an error if there are NaNs and check_valid is not ignore.

An aside: np.random.multivariate_normal is only relevant if you have a new cov each call (or don't mind repeated, possibly expensive calculations), so, I guess, adding checks by default won't upset many users.
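A minimal sketch of the kind of allclose-based check I mean (not the PR's actual code; the tolerance choices are mine):

import warnings
import numpy as np

def check_cov(cov, rtol=1e-8):
    # symmetric "up to roundoff", not bitwise equal
    if not np.allclose(cov, cov.T, rtol=rtol, atol=0):
        warnings.warn("covariance matrix is not symmetric")
    # PSD up to roundoff: tolerate tiny negative eigenvalues
    w = np.linalg.eigvalsh(cov)
    if w.min() < -rtol * max(w.max(), 1.0):
        warnings.warn("covariance matrix is not positive semi-definite")

cov = np.array([[1.0, 0.2], [0.2 + 1e-12, 1.0]])  # symmetric only up to roundoff
check_cov(cov)  # passes without a warning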
Josef

> PR: https://github.com/numpy/numpy/pull/5726
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion

From allanhaldane at gmail.com  Mon Mar 30 18:59:34 2015
From: allanhaldane at gmail.com (Allan Haldane)
Date: Mon, 30 Mar 2015 18:59:34 -0400
Subject: [Numpy-discussion] Rename arguments to np.clip and np.put
Message-ID: <5519D556.4030600@gmail.com>

Hello everyone,

What does the list think of renaming the arguments of np.clip and np.put to match those of ndarray.clip/put? Currently the signatures are

np.clip(a, a_min, a_max, out=None)
ndarray.clip(a, min=None, max=None, out=None)

np.put(a, ind, v, mode='raise')
ndarray.put(indices, values, mode='raise')

(The docstring for ndarray.clip is incorrect, too.)

I suggest the signatures might be changed to this:

np.clip(a, min=None, max=None, out=None, **kwargs)
np.put(a, indices, values, mode='raise')

We can still take care of the old argument names for np.clip using **kwargs, while showing a deprecation warning. I think that would be fully back-compatible. Note this makes np.clip more flexible, as only one of min or max is needed now, just like ndarray.clip.

np.put is trickier to keep back-compatible as it has two positional arguments. Someone who called `np.put(a, v=0, ind=1)` would be in trouble with this change, although I didn't find anyone on GitHub doing so. I suppose to maintain back-compatibility we could make indices and values keyword args, and use the same kwargs trick as in np.clip, but that might be confusing since they're both required args.

Any opinions or suggestions?

Allan

From jaime.frio at gmail.com  Mon Mar 30 19:16:38 2015
From: jaime.frio at gmail.com (=?UTF-8?Q?Jaime_Fern=C3=A1ndez_del_R=C3=ADo?=)
Date: Mon, 30 Mar 2015 16:16:38 -0700
Subject: Re: [Numpy-discussion] Rename arguments to np.clip and np.put
In-Reply-To: <5519D556.4030600@gmail.com>
References: <5519D556.4030600@gmail.com>
Message-ID:

On Mon, Mar 30, 2015 at 3:59 PM, Allan Haldane <allanhaldane at gmail.com> wrote:
> What does the list think of renaming the arguments of np.clip and np.put
> to match those of ndarray.clip/put?
> [...]
> np.put is trickier to keep back-compatible as it has two positional
> arguments.

Ideally we would want the signature to show as you describe it in the documentation, but during the deprecation period be something like e.g.

np.put(a, indices=None, values=None, mode='raise', **kwargs)

Not sure if that is even possible, maybe with functools.update_wrapper?

Jaime

--
(\__/)
( O.o)
( > <) Este es Conejo. Copia a Conejo en tu firma y ayúdale en sus planes de dominación mundial.

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From michael.klemm at intel.com  Tue Mar 31 02:49:37 2015
From: michael.klemm at intel.com (Klemm, Michael)
Date: Tue, 31 Mar 2015 06:49:37 +0000
Subject: [Numpy-discussion] How to Force Storage Order
Message-ID: <0DAB4B4FC42EAA41802458ADA9C2F824301304E8@IRSMSX104.ger.corp.intel.com>

Dear all,

I have found a bug in one of my codes and the way it passes a Numpy matrix to MKL's dgemm routine. Up to now I was assuming that the matrices are using C order. I guess I have to correct this assumption :-).

I have found that the numpy.linalg.svd algorithm creates the resulting U, sigma, and V matrices with Fortran storage. Is there any way to force these kinds of algorithms to not change the storage order? That would make passing the matrices to the native dgemm operation much easier.

Cheers,
-michael

Dr.-Ing. Michael Klemm
Senior Application Engineer
Software and Services Group
Developer Relations Division
Phone +49 89 9914 2340
Cell +49 174 2417583

Intel GmbH
Dornacher Strasse 1
85622 Feldkirchen/Muenchen, Deutschland
Sitz der Gesellschaft: Feldkirchen bei Muenchen
Geschaeftsfuehrer: Christian Lamprechter, Hannes Schwaderer, Douglas Lusk
Registergericht: Muenchen HRB 47456
Ust.-IdNr./VAT Registration No.: DE129385895
Citibank Frankfurt a.M. (BLZ 502 109 00) 600119052

From insertinterestingnamehere at gmail.com  Tue Mar 31 03:11:56 2015
From: insertinterestingnamehere at gmail.com (Ian Henriksen)
Date: Tue, 31 Mar 2015 07:11:56 +0000
Subject: Re: [Numpy-discussion] How to Force Storage Order
In-Reply-To: <0DAB4B4FC42EAA41802458ADA9C2F824301304E8@IRSMSX104.ger.corp.intel.com>
References: <0DAB4B4FC42EAA41802458ADA9C2F824301304E8@IRSMSX104.ger.corp.intel.com>
Message-ID:

On Tue, Mar 31, 2015, 12:50 AM Klemm, Michael <michael.klemm at intel.com> wrote:
> Dear all,
>
> I have found a bug in one of my codes and the way it passes a Numpy matrix
> to MKL's dgemm routine. Up to now I was assuming that the matrices are
> using C order. I guess I have to correct this assumption :-).
> [...]

Why not just call the algorithm on the transpose of the original array? That will transpose and reverse the order of the SVD, but taking the transposes once the algorithm is finished will ensure they are C ordered. You could also use np.ascontiguousarray on the output arrays, though that results in unnecessary copies that change the memory layout.

Best of luck!

-Ian Henriksen

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From sebastian at sipsolutions.net  Tue Mar 31 03:36:47 2015
From: sebastian at sipsolutions.net (Sebastian Berg)
Date: Tue, 31 Mar 2015 09:36:47 +0200
Subject: Re: [Numpy-discussion] How to Force Storage Order
In-Reply-To: References: <0DAB4B4FC42EAA41802458ADA9C2F824301304E8@IRSMSX104.ger.corp.intel.com>
Message-ID: <1427787407.4695.14.camel@sipsolutions.net>

On Di, 2015-03-31 at 07:11 +0000, Ian Henriksen wrote:
> On Tue, Mar 31, 2015, 12:50 AM Klemm, Michael wrote:
> > Dear all,
> >
> > I have found a bug in one of my codes and the way it passes a Numpy
> > matrix to MKL's dgemm routine.
> > [...]
>
> Why not just call the algorithm on the transpose of the original
> array? That will transpose and reverse the order of the SVD, but
> taking the transposes once the algorithm is finished will ensure they
> are C ordered. You could also use np.ascontiguousarray on the output
> arrays, though that results in unnecessary copies that change the
> memory layout.
> Best of luck!

Frankly, I would suggest to call some function like that in any case (or at least assert the memory order or have a test suite), just to make sure that if some implementation detail changes you are not relying on "this function will give me back a fortran ordered array". Of course you can do tricks to make this extra function call a no-op, since it will do nothing if the array is already in the correct memory order. But the quick check that everything is in order should not matter speed-wise. To be honest, after doing an SVD, even the copy likely doesn't matter speed-wise...

- Sebastian

> -Ian Henriksen
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 819 bytes
Desc: This is a digitally signed message part
URL:

From michael.klemm at intel.com  Tue Mar 31 05:28:58 2015
From: michael.klemm at intel.com (Klemm, Michael)
Date: Tue, 31 Mar 2015 09:28:58 +0000
Subject: Re: [Numpy-discussion] How to Force Storage Order
In-Reply-To: <1427787407.4695.14.camel@sipsolutions.net>
References: <0DAB4B4FC42EAA41802458ADA9C2F824301304E8@IRSMSX104.ger.corp.intel.com> <1427787407.4695.14.camel@sipsolutions.net>
Message-ID: <0DAB4B4FC42EAA41802458ADA9C2F82430131A23@IRSMSX104.ger.corp.intel.com>

Dear all,

> Frankly, I would suggest to call some function like that in any case (or
> at least assert the memory order or have a test suite), just to make sure
> that if some implementation detail changes you are not relying on "this
> function will give me back a fortran ordered array".
> Of course you can do tricks to make this extra function call a no-op,
> since it will do nothing if the array is already in the correct memory
> order. But the quick check that everything is in order should not matter
> speed-wise. To be honest, after doing an SVD, even the copy likely doesn't
> matter speed-wise...

After Ian's response I have decided to go for properly invoking dgemm with the right transpose parameters if the data is in transposed format. That's way more generic and works even if Numpy decides to switch again from C to F order or vice versa. All I assume now is that a temporary matrix is C order, but that is something I can control when constructing the matrix.
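In Python terms, the layout handling boils down to something like this sketch (the helper name is mine, not MKL's or pyMIC's API):

import numpy as np

def gemm_operand(a):
    # map the numpy memory layout onto a transpose flag for a
    # row-major dgemm call, instead of copying the data
    if a.flags.c_contiguous:
        return a, False
    if a.flags.f_contiguous:
        return a.T, True  # a.T is a C-contiguous view; mark it transposed
    return np.ascontiguousarray(a), False  # non-contiguous: fall back to a copy

u, s, v = np.linalg.svd(np.random.rand(5, 3), full_matrices=False)
mat, transposed = gemm_operand(u)  # no copy, whatever order svd returned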
Thanks for helping on this one!

Kind regards,
-michael

Intel GmbH
Dornacher Strasse 1
85622 Feldkirchen/Muenchen, Deutschland
Sitz der Gesellschaft: Feldkirchen bei Muenchen
Geschaeftsfuehrer: Christian Lamprechter, Hannes Schwaderer, Douglas Lusk
Registergericht: Muenchen HRB 47456
Ust.-IdNr./VAT Registration No.: DE129385895
Citibank Frankfurt a.M. (BLZ 502 109 00) 600119052

From allanhaldane at gmail.com  Tue Mar 31 10:54:53 2015
From: allanhaldane at gmail.com (Allan Haldane)
Date: Tue, 31 Mar 2015 10:54:53 -0400
Subject: Re: [Numpy-discussion] Rename arguments to np.clip and np.put
In-Reply-To: References: <5519D556.4030600@gmail.com>
Message-ID: <551AB53D.6050505@gmail.com>

On 03/30/2015 07:16 PM, Jaime Fernández del Río wrote:
> On Mon, Mar 30, 2015 at 3:59 PM, Allan Haldane wrote:
>
> What does the list think of renaming the arguments of np.clip and np.put
> to match those of ndarray.clip/put?
> [...]
>
> Ideally we would want the signature to show as you describe it in the
> documentation, but during the deprecation period be something like e.g.
>
> np.put(a, indices=None, values=None, mode='raise', **kwargs)
>
> Not sure if that is even possible, maybe with functools.update_wrapper?
>
> Jaime

In Python 3 the __signature__ attribute does exactly what we want. IPython 3.0.0 has also backported this for its docstrings internally for Python 2. But that still leaves out Python 2 users who don't read docstrings in IPython 3+.

functools.update_wrapper uses the inspect module (as does IPython), and the inspect module looks up func.func_code.co_varnames, which is a read-only Python internals object, so that doesn't work.

How about making indices/values be keyword args, using __signature__ to take care of Python 3, and then adding a note in the docstring for Python 2 users that they are only temporarily keyword args during the deprecation period?