Mailman 3 April 2020 - NumPy-Discussion

mixed mode arithmetic
by Neal Becker 11 Jul '23


I've been browsing the NumPy source and am wondering about mixed-mode arithmetic on arrays. I believe NumPy never performs mixed-type arithmetic directly; instead it converts both arrays to a common type first. Arguably, that may be efficient enough for a mix of, say, double and float. Maybe not. But for a mix of a complex and a real type (say, CDouble * Double), it is clearly suboptimal: once the real operand has been converted to complex, the multiply also processes imaginary parts that are known to be zero. So, do I understand this correctly? If so, is that something we should improve?
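The common type in question can be inspected from Python. The following is a minimal sketch, not a statement about which inner loop NumPy selects internally: it only shows the result type reported by `np.result_type`, and compares the ordinary product with a hand-written version that scales the real and imaginary parts separately. The array values are made up for illustration.

```python
import numpy as np

z = np.array([1 + 2j, 3 - 1j], dtype=np.complex128)  # "CDouble"
x = np.array([2.0, 0.5], dtype=np.float64)           # "Double"

# The type both operands are promoted to for the operation.
common = np.result_type(z, x)
float_mix = np.result_type(np.float32, np.float64)

# Ordinary mixed complex/real product.
product = z * x

# Mathematically, scaling a complex number by a real one only needs
# two real multiplications per element: one for each component.
manual = np.empty_like(z)
manual.real = z.real * x
manual.imag = z.imag * x

print(common, float_mix)
print(np.allclose(product, manual))
```

Timing the two variants on large arrays would be one way to check whether the conversion cost matters in practice.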


Invalid value encountered: how to prevent numpy.where from doing this?
by Eric Emsellem 18 Feb '23


Dear all,

I have code that uses "numpy.where" heavily to make some constrained calculations, as in:

    data = np.arange(10)
    result = np.where(data == 0, 0., 1./data)
    # or
    data1 = np.arange(10)
    data2 = np.arange(10) + 1.0
    result = np.where(data1 > data2, np.sqrt(data1 - data2), np.sqrt(data2 - data1))

which then produces warnings like:

    /usr/bin/ipython:1: RuntimeWarning: invalid value encountered in sqrt

or, for the first example:

    /usr/bin/ipython:1: RuntimeWarning: divide by zero encountered in divide

How do I prevent these messages from appearing? I know that I could in principle use numpy.seterr. However, I do NOT want to suppress these warnings for other potential divide/multiply/sqrt etc. errors, only where I am using a "where" precisely to avoid such problems.

Note that the warnings only appear once, but since I am going to release this code, I would like to spare users messages that are irrelevant here (because the "where" already tests when NOT to divide by zero or take the sqrt of a negative number).

thanks!

Eric
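The warnings appear because both value arguments of `np.where` are computed in full before the selection happens, so `1./data` is evaluated at `data == 0` as well. Two possible approaches are sketched below, under the assumption that a reasonably recent NumPy is available: `np.errstate` silences the warnings only inside a `with` block, and the `where=`/`out=` arguments of a ufunc skip the masked elements altogether. This is one way to do it, not necessarily the one the thread settled on.

```python
import numpy as np

data = np.arange(10)

# Option 1: ignore divide/invalid warnings only inside this block.
# The global error settings are restored when the block exits.
with np.errstate(divide="ignore", invalid="ignore"):
    result_where = np.where(data == 0, 0.0, 1.0 / data)

# Option 2: never compute the masked elements. Entries where the
# condition is False keep the value already stored in `out`.
result_masked = np.divide(
    1.0, data, out=np.zeros(data.shape), where=data != 0
)

print(result_where)
print(result_masked)
```

Option 2 avoids the wasted computation as well as the warning, at the cost of having to preallocate `out` with the fallback value.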


Re: [Numpy-discussion] example reading binary Fortran file
by Neil Martinsen-Burrell 22 Jul '21


David Froger <david.froger.info <at> gmail.com> writes:

> Hi, my question is about reading Fortran binary file (oh no this question
> again...)

I've posted this before, but I finally got it cleaned up for the Cookbook. For this purpose I use a subclass of file that has methods for reading unformatted Fortran data. See http://www.scipy.org/Cookbook/FortranIO/FortranFile. I'd gladly see this in numpy or scipy somewhere, but I'm not sure where it belongs.

> program makeArray
> implicit none
> integer,parameter:: nx=10,ny=20
> real(4),dimension(nx,ny):: ux,uy,p
> integer :: i,j
> open(11,file='uxuyp.bin',form='unformatted')
> do i = 1,nx
>   do j = 1,ny
>     ux(i,j) = real(i*j)
>     uy(i,j) = real(i)/real(j)
>     p (i,j) = real(i) + real(j)
>   enddo
> enddo
> write(11) ux,uy
> write(11) p
> close(11)
> end program makeArray

When I run the above program compiled with gfortran on my Intel Mac, I can read it back with::

    >>> import numpy as np
    >>> from fortranfile import FortranFile
    >>> f=FortranFile('uxuyp.bin', endian='<')
    >>> uxuy = f.readReals(prec='f') # 'f' for default reals
    >>> len(uxuy)
    400
    >>> ux = np.array(uxuy[:200]).reshape((20,10)).T
    >>> uy = np.array(uxuy[200:]).reshape((20,10)).T
    >>> p = f.readReals('f').reshape((20,10)).T
    >>> ux
    array([[ 1., 2., 3., 4., 5., 6., 7., 8., 9., 10.,
             11., 12., 13., 14., 15., 16., 17., 18., 19., 20.],
           [ 2., 4., 6., 8., 10., 12., 14., 16., 18., 20.,
             22., 24., 26., 28., 30., 32., 34., 36., 38., 40.],
           [ 3., 6., 9., 12., 15., 18., 21., 24., 27., 30.,
             33., 36., 39., 42., 45., 48., 51., 54., 57., 60.],
           [ 4., 8., 12., 16., 20., 24., 28., 32., 36., 40.,
             44., 48., 52., 56., 60., 64., 68., 72., 76., 80.],
           [ 5., 10., 15., 20., 25., 30., 35., 40., 45., 50.,
             55., 60., 65., 70., 75., 80., 85., 90., 95., 100.],
           [ 6., 12., 18., 24., 30., 36., 42., 48., 54., 60.,
             66., 72., 78., 84., 90., 96., 102., 108., 114., 120.],
           [ 7., 14., 21., 28., 35., 42., 49., 56., 63., 70.,
             77., 84., 91., 98., 105., 112., 119., 126., 133., 140.],
           [ 8., 16., 24., 32., 40., 48., 56., 64., 72., 80.,
             88., 96., 104., 112., 120., 128., 136., 144., 152., 160.],
           [ 9., 18., 27., 36., 45., 54., 63., 72., 81., 90.,
             99., 108., 117., 126., 135., 144., 153., 162., 171., 180.],
           [ 10., 20., 30., 40., 50., 60., 70., 80., 90., 100.,
             110., 120., 130., 140., 150., 160., 170., 180., 190., 200.]])
    >>> uy
    array([[ 1., 0.5, 0.33333334, 0.25, 0.2,
             0.16666667, 0.14285715, 0.125, 0.11111111, 0.1,
             0.09090909, 0.08333334, 0.07692308, 0.07142857, 0.06666667,
             0.0625, 0.05882353, 0.05555556, 0.05263158, 0.05],
           [ 2., 1., 0.66666669, 0.5, 0.40000001,
             0.33333334, 0.2857143, 0.25, 0.22222222, 0.2,
             0.18181819, 0.16666667, 0.15384616, 0.14285715, 0.13333334,
             0.125, 0.11764706, 0.11111111, 0.10526316, 0.1],
           [ 3., 1.5, 1., 0.75, 0.60000002,
             0.5, 0.42857143, 0.375, 0.33333334, 0.30000001,
             0.27272728, 0.25, 0.23076923, 0.21428572, 0.2,
             0.1875, 0.17647059, 0.16666667, 0.15789473, 0.15000001],
           [ 4., 2., 1.33333337, 1., 0.80000001,
             0.66666669, 0.5714286, 0.5, 0.44444445, 0.40000001,
             0.36363637, 0.33333334, 0.30769232, 0.2857143, 0.26666668,
             0.25, 0.23529412, 0.22222222, 0.21052632, 0.2],
           [ 5., 2.5, 1.66666663, 1.25, 1.,
             0.83333331, 0.71428573, 0.625, 0.55555558, 0.5,
             0.45454547, 0.41666666, 0.38461539, 0.35714287, 0.33333334,
             0.3125, 0.29411766, 0.27777779, 0.2631579, 0.25],
           [ 6., 3., 2., 1.5, 1.20000005,
             1., 0.85714287, 0.75, 0.66666669, 0.60000002,
             0.54545456, 0.5, 0.46153846, 0.42857143, 0.40000001,
             0.375, 0.35294119, 0.33333334, 0.31578946, 0.30000001],
           [ 7., 3.5, 2.33333325, 1.75, 1.39999998,
             1.16666663, 1., 0.875, 0.77777779, 0.69999999,
             0.63636363, 0.58333331, 0.53846157, 0.5, 0.46666667,
             0.4375, 0.41176471, 0.3888889, 0.36842105, 0.34999999],
           [ 8., 4., 2.66666675, 2., 1.60000002,
             1.33333337, 1.14285719, 1., 0.8888889, 0.80000001,
             0.72727275, 0.66666669, 0.61538464, 0.5714286, 0.53333336,
             0.5, 0.47058824, 0.44444445, 0.42105263, 0.40000001],
           [ 9., 4.5, 3., 2.25, 1.79999995,
             1.5, 1.28571427, 1.125, 1., 0.89999998,
             0.81818181, 0.75, 0.69230771, 0.64285713, 0.60000002,
             0.5625, 0.52941179, 0.5, 0.47368422, 0.44999999],
           [ 10., 5., 3.33333325, 2.5, 2.,
             1.66666663, 1.42857146, 1.25, 1.11111116, 1.,
             0.90909094, 0.83333331, 0.76923078, 0.71428573, 0.66666669,
             0.625, 0.58823532, 0.55555558, 0.52631581, 0.5]])
    >>> p
    array([[ 2., 3., 4., 5., 6., 7., 8., 9., 10., 11.,
             12., 13., 14., 15., 16., 17., 18., 19., 20., 21.],
           [ 3., 4., 5., 6., 7., 8., 9., 10., 11., 12.,
             13., 14., 15., 16., 17., 18., 19., 20., 21., 22.],
           [ 4., 5., 6., 7., 8., 9., 10., 11., 12., 13.,
             14., 15., 16., 17., 18., 19., 20., 21., 22., 23.],
           [ 5., 6., 7., 8., 9., 10., 11., 12., 13., 14.,
             15., 16., 17., 18., 19., 20., 21., 22., 23., 24.],
           [ 6., 7., 8., 9., 10., 11., 12., 13., 14., 15.,
             16., 17., 18., 19., 20., 21., 22., 23., 24., 25.],
           [ 7., 8., 9., 10., 11., 12., 13., 14., 15., 16.,
             17., 18., 19., 20., 21., 22., 23., 24., 25., 26.],
           [ 8., 9., 10., 11., 12., 13., 14., 15., 16., 17.,
             18., 19., 20., 21., 22., 23., 24., 25., 26., 27.],
           [ 9., 10., 11., 12., 13., 14., 15., 16., 17., 18.,
             19., 20., 21., 22., 23., 24., 25., 26., 27., 28.],
           [ 10., 11., 12., 13., 14., 15., 16., 17., 18., 19.,
             20., 21., 22., 23., 24., 25., 26., 27., 28., 29.],
           [ 11., 12., 13., 14., 15., 16., 17., 18., 19., 20.,
             21., 22., 23., 24., 25., 26., 27., 28., 29., 30.]])

Note that you have to provide the shape information for ux and uy because Fortran writes them together as a stream of 400 numbers.

-Neil
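For readers without the Cookbook class, the record layout can also be handled with plain NumPy. The sketch below is not the `FortranFile` implementation referred to above. It assumes that each unformatted sequential record is framed by a 4-byte little-endian byte count before and after the payload (the gfortran default, to my knowledge) and that arrays are stored in column-major order. The helper names `write_record` and `read_record` are hypothetical, and the file is produced from Python here so that the example is self-contained.

```python
import os
import tempfile

import numpy as np

MARKER = np.dtype("<i4")  # assumed: 4-byte little-endian record markers
REAL = np.dtype("<f4")    # real(4) in the Fortran program


def write_record(fh, *arrays):
    """Write arrays as one record: byte count, column-major payload, byte count."""
    payload = b"".join(
        np.asarray(a, dtype=REAL).tobytes(order="F") for a in arrays
    )
    marker = np.array([len(payload)], dtype=MARKER).tobytes()
    fh.write(marker + payload + marker)


def read_record(fh, dtype=REAL):
    """Read one record and check that leading and trailing markers agree."""
    (nbytes,) = np.frombuffer(fh.read(MARKER.itemsize), dtype=MARKER)
    data = np.frombuffer(fh.read(int(nbytes)), dtype=dtype)
    (tail,) = np.frombuffer(fh.read(MARKER.itemsize), dtype=MARKER)
    if tail != nbytes:
        raise ValueError("record markers disagree; wrong endianness or marker size?")
    return data


nx, ny = 10, 20
i = np.arange(1, nx + 1, dtype=REAL)[:, None]
j = np.arange(1, ny + 1, dtype=REAL)[None, :]
ux, uy, p = i * j, i / j, i + j

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "uxuyp.bin")
    with open(path, "wb") as fh:
        write_record(fh, ux, uy)  # mirrors: write(11) ux,uy
        write_record(fh, p)       # mirrors: write(11) p
    with open(path, "rb") as fh:
        uxuy = read_record(fh)    # 400 numbers: ux followed by uy
        p_flat = read_record(fh)

# Column-major reshape; equivalent to reshape((ny, nx)).T in the session above.
ux_back = uxuy[: nx * ny].reshape((nx, ny), order="F")
uy_back = uxuy[nx * ny :].reshape((nx, ny), order="F")
p_back = p_flat.reshape((nx, ny), order="F")
```

If SciPy is available, `scipy.io.FortranFile` provides a `read_reals` method for the same purpose, and is probably the better choice for files written by a real compiler.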


help translating into Russian
by Inessa Pawson 19 May '20


Our collaboration with the students and faculty from the Master’s program in Survey Methodology at the University of Michigan and the University of Maryland is underway. We are looking for a volunteer to translate the survey questionnaire into Russian. If you are available, or you know someone who would be interested in helping, please leave a comment here: https://github.com/numpy/numpy-surveys/issues/1.

--
Every good wish,
Inessa Pawson
Executive Director
Albus Code


Documentation Team Meeting - Monday April 6
by Melissa Mendonça 16 May '20


Hi all,

This is a reminder that we're having a Documentation Team Meeting next Monday, April 6th, at 3PM UTC**. If you wish to join on Zoom, you need to use this link: https://zoom.us/j/420005230

Here's the permanent hackmd document with the meeting notes: https://hackmd.io/oB_boakvRqKR-_2jRV-Qjg

Hope to see you around!

** You can click this link to get the correct time in your timezone: https://www.timeanddate.com/worldclock/fixedtime.html?msg=NumPy+Documentati…

- Melissa


Update the Code of Conduct Committee Membership (new members wanted)
by Sebastian Berg 10 May '20


Hi all,

it has come up in the last community call that many of our committee membership lists have not been updated in a while. This is not a big issue as such. But, while these committees are not very active on a day-to-day basis, they are an important part of the community, and it is better to update them regularly and thus also ensure they remain representative of the community.

We would like to start by updating the members of the Code of Conduct (CoC) committee. The CoC committee is in charge of responding to and following up on any reports of CoC breaches, as stated in: https://docs.scipy.org/doc/numpy/dev/conduct/code_of_conduct.html#incident-…

If you are interested in or happy to serve on our CoC committee, please let me or e.g. Ralf Gommers know, join the next community meeting (April 29th, 11:00PDT/18:00UTC), or reply on the list.

I hope we will be able to discuss and reach a consensus between those interested and involved quickly (possibly already on the next community call). In either case, any changes will be run by the mailing list before they are made, to ensure community consensus.

Cheers,

Sebastian


Feelings about type aliases in NumPy
by Joshua Wilson 09 May '20


Hey everyone,

Over in numpy-stubs we've been working on typing "array like": https://github.com/numpy/numpy-stubs/pull/66

It would be nice if the type were public so that downstream projects could use it (e.g. it would be very helpful in SciPy). Originally the plan was to only make it publicly available at typing time and not runtime, which would mean that no changes to NumPy are necessary; see https://github.com/numpy/numpy-stubs/pull/66#issuecomment-618784833 for more information on how that works.

But, Stephan pointed out that it might be confusing to users for objects to only exist at typing time, so we came around to the question of whether people are open to the idea of including the type aliases in NumPy itself. Ralf's concrete proposal was to make a module numpy.types (or maybe numpy.typing) to hold the aliases so that they don't pollute the top-level namespace. The module would initially contain the types

- ArrayLike
- DtypeLike
- (maybe) ShapeLike

Note that we would not need to make changes to NumPy right away; instead it would probably be done when numpy-stubs is merged into NumPy itself.

What do people think?

- Josh
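For context, a `numpy.typing` module exporting `ArrayLike` and `DTypeLike` has shipped with NumPy since version 1.20 (note the capitalisation, which differs from the `DtypeLike` spelling in the proposal). A minimal sketch of how a downstream function might use such aliases follows, assuming a NumPy version that provides the module. The helper `as_float_array` is hypothetical and not part of NumPy.

```python
import numpy as np
from numpy.typing import ArrayLike, DTypeLike


def as_float_array(values: ArrayLike, dtype: DTypeLike = np.float64) -> np.ndarray:
    """Accept anything NumPy can coerce to an array and return an ndarray."""
    return np.asarray(values, dtype=dtype)


# Lists, nested lists and scalars all satisfy ArrayLike.
from_list = as_float_array([1, 2, 3])
from_nested = as_float_array([[1, 2], [3, 4]], dtype="float32")
from_scalar = as_float_array(5)

print(from_list.dtype, from_nested.shape, from_scalar.shape)
```

The aliases exist at runtime as well as at typing time, which addresses the concern about objects that are only importable by a type checker.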


Season of Docs technical writer
by Ben Nathanson 30 Apr '20


I look forward to participating in this year's Season of Docs. Though it's early, I'm eager to start a conversation; I've posted the webpage https://bennathanson.com/numpy2020 to share my thoughts on contributing.


Deprecate Promotion of numbers to strings?
by Sebastian Berg 30 Apr '20


Hi all,

in https://github.com/numpy/numpy/pull/15925 I propose to deprecate promotion of strings and numbers. I have to double check whether this has a large effect on pandas, but it currently seems to me that it will be reasonable.

This means that `np.promote_types("S", "int8")`, etc. will lead to an error instead of returning `"S4"`. For the user, I believe the two main visible changes are that:

    np.array(["string", 0])

will stop creating a string array and return either an `object` array or give an error (object array would be the default currently).

Another larger visible change will be code such as:

    np.concatenate([np.array(["string"]), np.array([2])])

will result in an error instead of returning a string array. (Users will have to cast manually here.)

The alternative is to return an object array also for the concatenate example. I somewhat dislike that because `object` is not homogeneously typed and we thus lose type information. This also affects functions that wish to cast inputs to a common type (ufuncs also do this sometimes). A further example of this and discussion is at the end of the mail [1].

So the first question is whether we can form an agreement that an error is the better choice for `concatenate` and `np.promote_types()`. I.e. there is no one dtype that can faithfully represent both strings and integers. (This is currently the case e.g. for datetime64 and float64.)

The second question is what to do for:

    np.array(["string", 0])

which currently always returns strings. Arguably, it must also either return an `object` array, or raise an error (requiring the user to pick string or object using `dtype=object`).

The default would be to create a FutureWarning that an `object` array will be returned for `np.asarray(["string", 0])` in the future. But if we know already that we prefer an error, it would be better to give a DeprecationWarning right away. (It just does not seem nice to change the same thing twice even if the workaround is identical.)

Cheers,

Sebastian

[1] A second more in-depth point is that code such as:

    common_dtype = np.result_type(arr1, arr2)  # or promote_types
    arr1 = arr1.astype(common_dtype, copy=False)
    arr2 = arr2.astype(common_dtype, copy=False)

will currently use `string` in this case while it would error in the future. This already fails with other type combinations such as `datetime64` and `float64` at the moment.

The main alternative to this proposal is to return `object` for the common dtype, since an object array is not homogeneously typed, it arguably can represent both inputs. I do not quite like this choice personally because in the above example, it may be that the next line is something like:

    return arr1 * arr2

in which case, the preferred return may be `str` and not `object`. We currently never promote to `object` unless one of the arrays is already an `object` array, and that seems like the right choice to me.
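The workarounds mentioned above (casting manually, or requesting `dtype=object`) can be sketched as follows. The example deliberately avoids relying on what the implicit string/number promotion does, since that is exactly the behaviour under discussion and may differ between NumPy versions. The `datetime64`/`float64` check reflects the statement in the mail that this combination already fails.

```python
import numpy as np

strings = np.array(["string"])
numbers = np.array([2])

# Manual cast: the caller explicitly chooses the string representation,
# so no implicit number-to-string promotion is needed.
as_strings = np.concatenate([strings, numbers.astype(str)])

# Alternative: explicitly request an object array, which keeps the
# original Python objects instead of converting 0 to "0".
as_objects = np.array(["string", 0], dtype=object)

# datetime64 and float64 have no common dtype, so promotion raises.
try:
    np.promote_types("datetime64[s]", "float64")
    datetime_promotes = True
except TypeError:
    datetime_promotes = False

print(as_strings, as_objects, datetime_promotes)
```

Both workarounds make the intended result type visible at the call site, which is the main argument for raising an error in the implicit case.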


Proposal: add `force=` or `copy=` kwarg to `__array__` interface
by Juan Nunez-Iglesias 29 Apr '20


Hello NumPy-ers!

The __array__ method is a great little tool to allow interoperability with NumPy. Briefly, by calling `np.array()` or `np.asarray()` on an object with an `__array__` method, one can get a NumPy representation of that object, which may or may not involve data copying (this is up to the object’s implementation of `__array__`). Some references:

https://numpy.org/devdocs/user/basics.dispatch.html
https://docs.scipy.org/doc/numpy/reference/arrays.classes.html#numpy.class.…
https://numpy.org/devdocs/reference/generated/numpy.array.html
https://numpy.org/devdocs/reference/generated/numpy.asarray.html

(I couldn’t find an authoritative guide on good and bad practices with `__array__`, btw.)

For people writing e.g. visualisation libraries, this is a wonderful thing, because if we know how to visualise NumPy arrays, we can suddenly visualise anything with an `__array__` method. As an example, napari, while not being aware of dask, can visualise large dask arrays out of the box, which allows us to view 100GB out-of-core datasets easily.

However, in many cases, instantiating a NumPy array is an expensive operation, for example copying an array from GPU to CPU memory, or involves substantial loss of information. Some library authors are reluctant to allow implicit execution of such an operation, such as PyOpenCL [1], PyTorch [2], or even scipy.sparse.

My proposal is to add an optional argument to `__array__` that would signal to the downstream library that we *really* want a NumPy array and are willing to wait for it. In the PyTorch issue I proposed `force=True`, and they are somewhat receptive to this, but, reading more about the existing NumPy APIs, I think `copy=True` would be a nice alternative:

- np.array already has a copy= keyword argument. Under this proposal, it would attempt to pass it to the downstream library, and, if that failed, it would try again without it and run its own copy.
- np.asarray could get a new copy= keyword argument that would match np.array’s.
- It would neatly express the idea that the array is going to e.g. get passed around between devices.

Or, we could just go with `force=`.

One bit of expressivity we would miss is “copy if necessary, but otherwise don’t bother”, but there are workarounds to this.

What do people think? I would be happy to write a PR and/or NEP for this if there is general consensus that this would be useful.

Thanks,

Juan.

Refs:
[1]: https://github.com/inducer/pyopencl/pull/301
[2]: https://github.com/pytorch/pytorch/issues/36560
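To make the mechanism concrete, here is a minimal sketch of an object exposing `__array__`. The class `LazyBlock` and its `conversions` counter are hypothetical and stand in for a container whose conversion is expensive. The signature accepts optional `dtype` and `copy` arguments so that it works whether or not the calling NumPy version passes them; how `copy` should be honoured is the subject of the proposal, so this sketch simply always builds a fresh array.

```python
import numpy as np


class LazyBlock:
    """Hypothetical container whose conversion to NumPy is 'expensive'."""

    def __init__(self, values):
        self._values = list(values)
        self.conversions = 0  # counts how often a NumPy array was built

    def __array__(self, dtype=None, copy=None):
        # A real library might transfer data from a GPU or densify a
        # sparse matrix here. This sketch always creates a new array,
        # so the `copy` argument needs no special handling.
        self.conversions += 1
        return np.array(self._values, dtype=dtype)


block = LazyBlock([1, 2, 3])

# Any function that calls np.asarray on its input can now consume `block`.
arr = np.asarray(block)
total = np.sum(block)

print(arr, total, block.conversions)
```

The counter shows why implicit conversion can be a concern: every NumPy function that receives `block` triggers another conversion, which is cheap here but may not be for GPU or out-of-core data.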
