Changes in PyArray_FromAny between 1.5.x and 1.6.x
Hello,
In trying to upgrade NumPy within Sage, we noticed some differences in behavior between 1.5 and 1.6. In particular, in 1.5, we have
sage: f = 0.5
sage: f.__array_interface__
{'typestr': '=f8'}
sage: numpy.array(f)
array(0.5)
sage: numpy.array(float(f))
array(0.5)
In 1.6, we get the following,
sage: f = 0.5
sage: f.__array_interface__
{'typestr': '=f8'}
sage: numpy.array(f)
array(0.500000000000000, dtype=object)
This seems to be due to the changes in PyArray_FromAny introduced in https://github.com/mwhansen/numpy/commit/2635398db3f26529ce2aaea4028a8118844... . In particular, _array_find_type used to be used to query our __array_interface__ attribute, and it no longer seems to work. Is there a way to get the old behavior with the current code?
Mike
On Mon, May 28, 2012 at 3:15 AM, Mike Hansen mhansen@gmail.com wrote:
[...] Is there a way to get the old behavior with the current code?
Any ideas?
Thanks, Mike
On 06/04/2012 09:06 PM, Mike Hansen wrote:
This seems to be due to the changes in PyArray_FromAny introduced in https://github.com/mwhansen/numpy/commit/2635398db3f26529ce2aaea4028a8118844... . In particular, _array_find_type used to be used to query our __array_interface__ attribute, and it no longer seems to work. Is there a way to get the old behavior with the current code?
No idea. If you want to spend the time to fix this properly, you could implement PEP 3118 and use that instead to export your array data (which can be done from Cython using __getbuffer__ on a Cython class).
Dag
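For readers unfamiliar with PEP 3118, here is a minimal sketch of what the buffer protocol exposes, using only the standard library. The array.array object is a stand-in (an assumption for illustration) for a Cython class that defines __getbuffer__; nothing here is Sage or NumPy code.

```python
import array

# array.array exports its data through the PEP 3118 buffer protocol; it
# stands in for a Cython class defining __getbuffer__ (illustration only).
values = array.array("d", [0.5, 1.5])

# memoryview is a generic PEP 3118 consumer: it reads the element format,
# item size and shape from the exporter without copying the data.
view = memoryview(values)
fmt, itemsize, shape = view.format, view.itemsize, view.shape

# The view shares memory with the exporter, so a write through the view
# is visible in the original object.
view[0] = 2.5
shared = values[0]
```

The point of the protocol is this kind of memory sharing plus type metadata, which matters for the later question of whether it fits the Sage use case at all.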
Can you raise an issue on the Github issue tracker for NumPy? These issues will be looked at more closely. This kind of change should not have made it into the release.
<offtopic> Given the lack of availability of time from enough experts in NumPy, this is the sort of thing that can happen. I was not able to guide development of NumPy appropriately at my old job. That's a big reason I left. I still have more to do than just guide NumPy now, but making sure NumPy is maintained is a big part of what I am doing and why both NumFOCUS and Continuum Analytics exist. I am very hopeful that we can avoid this sort of regression in the future. More tests will help. </offtopic>
I think it's important to note that there are many people who will be in the same boat of upgrading to 1.6 over the coming year and there are going to be other little issues like this we will need to address.
Travis
On Mon, Jun 4, 2012 at 9:30 PM, Travis Oliphant travis@continuum.io wrote:
Can you raise an issue on the Github issue tracker for NumPy? These issues will be looked at more closely. This kind of change should not have made it into the release.
Thanks Travis! I've made this https://github.com/numpy/numpy/issues/291
Mike
On Mon, Jun 4, 2012 at 10:12 PM, Dag Sverre Seljebotn d.s.seljebotn@astro.uio.no wrote:
No idea. If you want to spend the time to fix this properly, you could implement PEP 3118 and use that instead to export your array data (which can be done from Cython using __getbuffer__ on a Cython class).
I don't think that would work, because looking more closely, I don't think they're actually doing anything like what __array_interface__/PEP3118 are designed for. They just have some custom class ("sage.rings.real_mpfr.RealLiteral", I guess an arbitrary precision floating point of some sort?), and they want instances that are passed to np.array() to be automatically coerced to another type (float64) by default. But there's no buffer sharing or anything like that going on at all. Mike, does that sound right?
This automagic coercion seems... in very dubious taste to me. (Why does creating an array object imply that you want to throw away precision? You can already throw away precision explicitly by doing np.array(f, dtype=float).) But if this automatic coercion feature is useful, then wouldn't it be better to have a different interface instead of kluging it into __array_interface__, like we should check for an attribute called __numpy_preferred_dtype__ or something?
n
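A minimal sketch of the explicit coercion mentioned above, np.array(f, dtype=float). fractions.Fraction is used purely as a stand-in for an exact number type such as a Sage real (an assumption; Sage is not needed to run this), and the behavior shown is what I would expect from recent NumPy versions rather than a statement about 1.5 or 1.6.

```python
from fractions import Fraction

import numpy as np

# Fraction stands in for an exact number type such as a Sage real.
f = Fraction(1, 2)

# Without a dtype, NumPy does not know the type and stores the object itself.
as_object = np.array(f)

# With an explicit dtype, NumPy converts the value through float(f).
as_float = np.array(f, dtype=float)
```

Here as_object has dtype object while as_float is a float64 array holding 0.5, so the loss of exactness is opted into at the call site.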
On Tue, Jun 5, 2012 at 8:34 AM, Nathaniel Smith njs@pobox.com wrote:
I don't think that would work, because looking more closely, I don't think they're actually doing anything like what __array_interface__/PEP3118 are designed for. They just have some custom class ("sage.rings.real_mpfr.RealLiteral", I guess an arbitrary precision floating point of some sort?), and they want instances that are passed to np.array() to be automatically coerced to another type (float64) by default. But there's no buffer sharing or anything like that going on at all. Mike, does that sound right?
Yes, there's no buffer sharing going on at all.
This automagic coercion seems... in very dubious taste to me. (Why does creating an array object imply that you want to throw away precision?
The __array_interface__ attribute is a property which depends on the precision of the ring. If floats have enough precision, you just get floats; otherwise you get objects.
You can already throw away precision explicitly by doing np.array(f, dtype=float).) But if this automatic coercion feature is useful, then wouldn't it be better to have a different interface instead of kluging it into __array_interface__, like we should check for an attribute called __numpy_preferred_dtype__ or something?
It isn't just the array() calls which end up getting problems. For example, in 1.5.x
sage: f = 10; type(f)
<type 'sage.rings.integer.Integer'>
sage: numpy.arange(f)
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) #int64
while in 1.6.x
sage: numpy.arange(f)
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=object)
We also see problems with calls like
sage: scipy.stats.uniform(0,15).ppf([0.5,0.7])
array([ 7.5, 10.5])
which work in 1.5.x, but fail with a traceback "TypeError: array cannot be safely cast to required type" in 1.6.x.
Mike
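One version-independent way to sidestep the arange difference is to convert to a builtin int before calling NumPy. This is only a sketch of that workaround, not a fix for the regression itself; Fraction(10) is a stand-in for a sage.rings.integer.Integer (an assumption for illustration).

```python
from fractions import Fraction

import numpy as np

# Stand-in for a Sage Integer; any type that supports int() would do.
n = Fraction(10)

# Convert at the boundary so NumPy's dtype discovery never sees the custom
# type; arange then uses the platform's default integer dtype.
indices = np.arange(int(n))
```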
I'm getting problems like this after a 1.6 upgrade as well. Lots of object arrays being created when previously there would either be an error, or an array of floats.
Also, lots of the "TypeError: array cannot be safely cast to required type" are cropping up.
Honestly, most of these are in places where my code was lax and so I just cleaned things up to use the right dtypes etc. But still a bit unexpected in terms of having more code to fix than I was used to for 0.X numpy revisions.
Just another datapoint, though. Not really a complaint.
Zach
On Tue, Jun 5, 2012 at 11:51 AM, Zachary Pincus zachary.pincus@yale.edu wrote:
I'm getting problems like this after a 1.6 upgrade as well. Lots of object arrays being created when previously there would either be an error, or an array of floats.
Also, lots of the "TypeError: array cannot be safely cast to required type" are cropping up.
Honestly, most of these are in places where my code was lax and so I just cleaned things up to use the right dtypes etc. But still a bit unexpected in terms of having more code to fix than I was used to for 0.X numpy revisions.
There is a fine line here. We do need to make people clean up lax code in order to improve numpy, but hopefully we can keep the cleanups reasonable.
Chuck
Oh agreed. Somehow, though, I was surprised by this, even though I keep tabs on the numpy lists: at no point did it become clear that "big changes in how arrays get constructed and typecast are ahead that may require code fixes". That was my main point, but probably a PEBCAK issue more than anything.
Zach
On Tue, Jun 5, 2012 at 8:41 PM, Zachary Pincus zachary.pincus@yale.edu wrote:
Oh agreed. Somehow, though, I was surprised by this, even though I keep tabs on the numpy lists: at no point did it become clear that "big changes in how arrays get constructed and typecast are ahead that may require code fixes". That was my main point, but probably a PEBCAK issue more than anything.
It was fairly extensively discussed when introduced, http://thread.gmane.org/gmane.comp.python.numeric.general/44206, and again at some later point.
Ralf
On Tue, Jun 5, 2012 at 7:47 PM, Ralf Gommers ralf.gommers@googlemail.com wrote:
It was fairly extensively discussed when introduced, http://thread.gmane.org/gmane.comp.python.numeric.general/44206, and again at some later point.
Those are the not-yet-finalized changes in 1.7; Zachary (I think) is talking about problems upgrading from ~1.5 to 1.6.
n
Yes, unless I'm wrong I experienced these problems from 1.5.something to 1.6.1. I didn't take notes as it was in the middle of a deadline-crunch so I just fixed the code and moved on (long, stupid story about why the upgrade before a deadline...). It's just that the issues mentioned above seem to have hit me too and I wanted to mention that. But unhelpfully, I think, without code, and now I've hijacked this thread! Sorry.
Zach
During the original discussion, Gael pointed out that the changes would probably break some code (which might need to be cleaned up but still). I think it was underestimated how quickly people would upgrade and see the changes and therefore be able to report problems.
We are talking about a 1.7 release, but there are still people who have not upgraded their code to use 1.6 (when some of the big changes occurred).
This should probably guide our view of how long it takes to migrate behavior in NumPy and minimize migration difficulties for users.
Travis
I don't think that would work, because looking more closely, I don't think they're actually doing anything like what __array_interface__/PEP3118 are designed for. They just have some custom class ("sage.rings.real_mpfr.RealLiteral", I guess an arbitrary precision floating point of some sort?), and they want instances that are passed to np.array() to be automatically coerced to another type (float64) by default. But there's no buffer sharing or anything like that going on at all. Mike, does that sound right?
This automagic coercion seems... in very dubious taste to me. (Why does creating an array object imply that you want to throw away precision? You can already throw away precision explicitly by doing np.array(f, dtype=float).) But if this automatic coercion feature is useful, then wouldn't it be better to have a different interface instead of kluging it into __array_interface__, like we should check for an attribute called __numpy_preferred_dtype__ or something?
Interesting. It does look like off-label use of the __array_interface__ attribute. Given that "array" used to query the __array_interface__ attribute for type discovery, I still wonder why it was disabled in 1.6?
Travis
On Wed, Jun 6, 2012 at 5:11 AM, Travis Oliphant travis@continuum.io wrote:
During the original discussion, Gael pointed out that the changes would probably break some code (which might need to be cleaned up but still). I think it was underestimated how quickly people would upgrade and see the changes and therefore be able to report problems.
You're making the same mistake I made above. This error occurs in 1.6.x, so before the proposed change to casting='same_kind'.
That's not actually the default right now, by the way: in both 1.6.2 and current master the default is 'safe'.
In [3]: np.__version__
Out[3]: '1.7.0.devfd78546'
In [4]: print np.can_cast.__doc__
can_cast(from, totype, casting = 'safe')
Ralf
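To make the 'safe' versus 'same_kind' distinction concrete, here is a small sketch using np.can_cast and ndarray.astype with dtype arguments. It reflects the casting rules as documented in recent NumPy releases; the exact wording of the TypeError differs between versions, so only the exception type is checked.

```python
import numpy as np

# The default rule is 'safe': only value-preserving casts are allowed.
int_to_float = np.can_cast(np.int32, np.float64)       # widening
float_narrowing = np.can_cast(np.float64, np.float32)  # would lose precision

# 'same_kind' additionally allows narrowing within a kind (float -> float).
float_same_kind = np.can_cast(np.float64, np.float32, casting="same_kind")

# Neither rule allows float -> int.
float_to_int = np.can_cast(np.float64, np.int64, casting="same_kind")

# astype with casting='safe' raises TypeError when the cast is not allowed.
try:
    np.array([1.5]).astype(np.int64, casting="safe")
    raised = False
except TypeError:
    raised = True
```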
participants (7)

Charles R Harris

Dag Sverre Seljebotn

Mike Hansen

Nathaniel Smith

Ralf Gommers

Travis Oliphant

Zachary Pincus