Cython 0.16 and ndarray fields deprecation
I'm wondering what the best course of action for deprecating the shape field in numpy.pxd is.
The thing is, currently "shape" really gets in the way. In most situations, slow access to shape through the Python layer is OK, and "arr.shape[0]" is often just fine, but currently one is in a situation where one must either write "(<object>arr).shape[0]" or "np.PyArray_DIMS(arr)[0]", or be faced with code that isn't forward-compatible with NumPy.
It would really be good to do the transition as fast as possible, so that all Cython code eventually becomes ready for upcoming NumPy releases.
The simplest change to make would be to simply remove all the ndarray fields from numpy.pxd, and inform about the alternatives in the release notes. That could be done in time for 0.16.
The alternative, emitting sane deprecation warnings in just the right places, takes more work. I can't work on that myself until PyCon sprints... perhaps put out 0.16.1 with just this change after PyCon sprints though...
Dag
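For readers following along, the two workarounds named in this message can be sketched roughly as below. This is an illustration under assumptions, not code from the thread: the function names are hypothetical, and it assumes a module built against NumPy with Cython's bundled numpy.pxd.

```cython
# Illustrative sketch only; first_dim_python and first_dim_c_api are
# hypothetical names, not part of Cython or NumPy.
cimport numpy as np

np.import_array()  # initialise the NumPy C API before calling PyArray_* functions

def first_dim_python(np.ndarray arr):
    # Casting to <object> bypasses the C field declared in numpy.pxd, so
    # .shape is looked up as the ordinary Python attribute (a tuple).
    return (<object>arr).shape[0]

def first_dim_c_api(np.ndarray arr):
    # PyArray_DIMS returns a pointer to the dimensions array; indexing it
    # is only valid when the array has at least one dimension.
    if np.PyArray_NDIM(arr) < 1:
        raise ValueError("expected an array with at least one dimension")
    return np.PyArray_DIMS(arr)[0]
```

The first form pays for a Python attribute lookup and a tuple index; the second stays at the C level but is more verbose, which is the trade-off the message describes.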
Dag Sverre Seljebotn, 29.02.2012 18:06:
> I'm wondering what the best course of action for deprecating the shape field in numpy.pxd is.
> The thing is, currently "shape" really gets in the way. In most situations, slow access to shape through the Python layer is OK, and "arr.shape[0]" is often just fine, but currently one is in a situation where one must either write "(<object>arr).shape[0]" or "np.PyArray_DIMS(arr)[0]", or be faced with code that isn't forward-compatible with NumPy.
Can Cython emulate this at the C layer? And even your work-around for the Python object access looks more like a Cython bug to me. I wouldn't know why that can't "just work". It usually works for other undeclared Python attributes of "anything", so it might just as well be made to work here.
> It would really be good to do the transition as fast as possible, so that all Cython code eventually becomes ready for upcoming NumPy releases.
But it previously worked, right? It's just no longer supported in newer NumPy versions IIUC? If that's the case, deleting it would break otherwise working code. No-one forces you to switch to the latest NumPy version, after all, and certainly not right now. A warning is much better.
> The simplest change to make would be to simply remove all the ndarray fields from numpy.pxd, and inform about the alternatives in the release notes. That could be done in time for 0.16.
> The alternative, emitting sane deprecation warnings in just the right places, takes more work. I can't work on that myself until PyCon sprints... perhaps put out 0.16.1 with just this change after PyCon sprints though...
Personally, I don't think this is time critical.
Stefan
On 02/29/2012 09:42 AM, Stefan Behnel wrote:
> Can Cython emulate this at the C layer? And even your work-around for the Python object access looks more like a Cython bug to me. I wouldn't know why that can't "just work". It usually works for other undeclared Python attributes of "anything", so it might just as well be made to work here.
Well, the problem is that shape is currently declared as a C field. It is also available as a Python attribute. Usually the user doesn't care which one is used, but the C field is declared for the few cases where access is speed-critical. Though even with current NumPy, I find myself doing "print (<object>arr).shape" in order to get a tuple rather than a Py_ssize_t*...
> But it previously worked, right? It's just no longer supported in newer NumPy versions IIUC? If that's the case, deleting it would break otherwise working code. No-one forces you to switch to the latest NumPy version, after all, and certainly not right now. A warning is much better.
It previously worked, but it turns out that it was always frowned upon. I didn't know that when I added the fields, and it was a convenient way of speeding things up...
Dag
On 29 February 2012 17:57, Dag Sverre Seljebotn <d.s.seljebotn@astro.uio.no> wrote:
> It previously worked, but it turns out that it was always frowned upon. I didn't know that when I added the fields, and it was a convenient way of speeding things up...
I would personally prefer either cdef nogil extension class properties (needs compiler support) or just special-casing in the compiler, which shouldn't be too hard I think. Warnings would be a first step, but the linkage of ndarray attributes to C attributes is really an implementation detail, so it's better to keep supporting the attributes correctly.
_______________________________________________ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel
On 03/01/2012 04:03 AM, mark florisson wrote:
> I would personally prefer either cdef nogil extension class properties (needs compiler support) or just special-casing in the compiler, which shouldn't be too hard I think. Warnings would be a first step, but the linkage of ndarray attributes to C attributes is really an implementation detail, so it's better to keep supporting the attributes correctly.
So you are saying we (somehow) stick with supporting "arr.shape[0]" in the future, and perhaps even support "print arr.shape"? (+ arr.ndim, arr.strides). Exactly how is something we could figure out at PyCon.
I'm anyway leaning towards deprecating arr.data, as it's too different from what the Python attribute does.
Reason I'm asking is that I'm giving a talk on Saturday, and I don't want to teach people bad habits -- so we must figure out what the bad habits are :-) (I think this applies for the PyCon poster as well...)
[1] PyData workshop at Google's offices in Mountain View; the event was open for all but now it is full with a long waiting list, which is why I didn't announce it. http://pydataworkshop.eventbrite.com/
Dag
On 01.03.2012 17:18, Dag Sverre Seljebotn wrote:
> I'm anyway leaning towards deprecating arr.data, as it's too different from what the Python attribute does.
This should be preferred, I think:
&arr[0]
or
<char*> &arr[0]
The latter is exactly what arr.data currently does in Cython (but not in Python). But there is code in SciPy that depends on the arr.data attribute in Cython, such as cKDTree.
Sturla
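A minimal sketch of the pointer idioms mentioned here, assuming a one-dimensional, C-contiguous float64 array with at least one element. The function name is hypothetical and the block is an illustration, not code from the thread or from SciPy.

```cython
# Illustrative sketch only; data_pointers_agree is a hypothetical name.
cimport numpy as np

np.import_array()  # initialise the NumPy C API before calling PyArray_* functions

def data_pointers_agree(np.ndarray[np.float64_t, ndim=1, mode="c"] arr):
    # Typed pointer to the first element; with bounds checking enabled,
    # arr[0] raises IndexError on an empty array.
    cdef np.float64_t* typed_ptr = &arr[0]
    # Byte-pointer form, the spelling suggested as a stand-in for arr.data.
    cdef char* byte_ptr = <char*> &arr[0]
    # The same address obtained through the NumPy C API accessor.
    cdef char* api_ptr = <char*> np.PyArray_DATA(arr)
    return byte_ptr == api_ptr and <char*> typed_ptr == byte_ptr
```

Neither spelling touches the data field declared in numpy.pxd, so code written this way should not depend on that declaration.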
On 01.03.2012 17:18, Dag Sverre Seljebotn wrote:
> So you are saying we (somehow) stick with supporting "arr.shape[0]" in the future, and perhaps even support "print arr.shape"? (+ arr.ndim, arr.strides).
What if you just deprecate ndarray support completely, and just focus on memory views?
Yes, you will break all Cython code in the world depending on ndarrays. But you will do that anyway by tampering with the interface. And as changes to the NumPy C API mandate a change to the interface, I see no reason to keep it. If you are going to break all code, then just do it completely. It is worse to stick to syntax bloat. There should not be multiple ways to do the same thing (like the Zen of Python).
Sturla
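For comparison, a rough sketch of what the memory-view route looks like, assuming a Cython version with typed memoryview support and input that exposes a buffer of C doubles (for example a float64 NumPy array). The function names are hypothetical; this illustrates the idea rather than anything proposed verbatim in the thread.

```cython
# Illustrative sketch only; first_dim and total are hypothetical names.
# No cimport of numpy is needed: typed memoryviews use the buffer protocol.

def first_dim(double[:] view):
    # On a typed memoryview, shape[0] is read at the C level.
    return view.shape[0]

def total(double[::1] view):
    # double[::1] additionally requires the data to be C-contiguous.
    cdef Py_ssize_t i
    cdef double acc = 0.0
    for i in range(view.shape[0]):
        acc += view[i]
    return acc
```

Because the shape lives on the memoryview rather than on a field of the ndarray struct, this style sidesteps the numpy.pxd field question entirely, at the cost of rewriting code that uses the ndarray buffer syntax.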
On 03/01/2012 09:29 AM, Sturla Molden wrote:
> What if you just deprecate ndarray support completely, and just focus on memory views?
> Yes, you will break all Cython code in the world depending on ndarrays. But you will do that anyway by tampering with the interface. And as changes to the NumPy C API mandate a change to the interface, I see no reason to keep it. If you are going to break all code, then just do it completely. It is worse to stick to syntax bloat. There should not be multiple ways to do the same thing (like the Zen of Python).
Yeah, I proposed this on another thread as one of the options, but the support wasn't overwhelming at the time...
About the SciPy cKDTree issue: the SciPy process of generating Cython code manually when the code is written makes the problem slightly smaller...
Dag
On 01.03.2012 19:33, Dag Sverre Seljebotn wrote:
> Yeah, I proposed this on another thread as one of the options, but the support wasn't overwhelming at the time...
I think it is worse to break parts of it, thus introducing bugs that might go silent for a long time. Rather deprecate the whole ndarray interface.
Sturla
On 1 March 2012 19:16, Sturla Molden <sturla@molden.no> wrote:
> I think it is worse to break parts of it, thus introducing bugs that might go silent for a long time.
The point is that we would remain fully backwards compatible (except for the data attribute perhaps) and support the attributes portably across numpy versions, nothing would be broken or silent.
> Rather deprecate the whole ndarray interface.
As much as I would like that, you can't just break everyone's code in a new release. I think the syntax should be removed several versions later, or maybe even wait until Cython 1.0.
On 1 March 2012 16:18, Dag Sverre Seljebotn <d.s.seljebotn@astro.uio.no> wrote:
> So you are saying we (somehow) stick with supporting "arr.shape[0]" in the future, and perhaps even support "print arr.shape"? (+ arr.ndim, arr.strides). Exactly how is something we could figure out at PyCon.
To remain consistent with previous versions the former should be supported and the latter would be a bonus (and it wouldn't be too hard anyway).
> I'm anyway leaning towards deprecating arr.data, as it's too different from what the Python attribute does.
+1 for that, just write &arr[0] as Sturla mentioned. The transition should be trivial.
On Fri, Mar 2, 2012 at 8:29 AM, mark florisson <markflorisson88@gmail.com> wrote:
> > I'm anyway leaning towards deprecating arr.data, as it's too different from what the Python attribute does.
> +1 for that, just write &arr[0] as Sturla mentioned. The transition should be trivial.
If there's confusion due to .data already having a certain meaning as the Python attribute, perhaps it would make sense to have an attribute with a different name, e.g. .ptr or .voidptr? IMHO writing &arr[0] looks like a workaround of some kind. It is like C, where if you had something like a 2d array and needed to interpret it as a 1d array you'd write &arr[0][0]; but C array syntax doesn't support attributes, which you can add here. Unless of course the idea is to make arrays behave and look exactly like their C counterparts.
Dimitri.
participants (5)
- Dag Sverre Seljebotn
- Dimitri Tcaciuc
- mark florisson
- Stefan Behnel
- Sturla Molden