Hi, I spent up to now 2 or 3 days making change to Theano to support numpy 1.7b1. But now, I just find an interface change that will need recoding a function, not just small code change. The problem is that we can't access fields from PyArrayObject anymore, we absolutely must use the old macro/newly function. For the data field, the new function don't allow to set it. There is no function that allow to do this. After so much time spent on small syntactic change, I don't feel making more complex change today. Also, I think there should be a function PyArray_SetDataPtr as similar to PyArray_SetBaseObject. Do you plan to add one? I though that you wanted to force the removing of the old API, but I never hear you wanted to disable this. The only current work around is to create a new Array, but then we need to update the refcount and other stuff. This will make the code slower too. I'll make another post on the problem I got when making Theano work with numpy 1.7b1, but I want to make this functionality removing problem into its own thread. So, do you plan to add an PyArray_SetDataPtr? Do you agree we should have one? Fred
On Wed, Sep 5, 2012 at 6:36 PM, Frédéric Bastien <nouiz@nouiz.org> wrote:
Hi,
I spent up to now 2 or 3 days making change to Theano to support numpy 1.7b1. But now, I just find an interface change that will need recoding a function, not just small code change.
My understanding was that 1.7 is not supposed to require any code changes... so, separate from your actual question about assigning to the data field can I ask: are the changes you're talking about just to avoid *deprecated* APIs, or did you have actual problems running Theano against 1.7b1? And if you had actual problems, could you say what? (Or just post a diff of the changes you found you had to make, which should amount to the same thing?) -n
Hi Frederic, On Wed, Sep 5, 2012 at 6:36 PM, Frédéric Bastien <nouiz@nouiz.org> wrote:
Hi,
I spent up to now 2 or 3 days making change to Theano to support numpy 1.7b1. But now, I just find an interface change that will need recoding a function, not just small code change.
The problem is that we can't access fields from PyArrayObject anymore, we absolutely must use the old macro/newly function.
Why can't you adress the PyArrayObject anymore ? It is deprecated, but the structure itself has not changed. It would certainly be a significant issue if that is not possible anymore, as it would be a significant API break.
For the data field, the new function don't allow to set it. There is no function that allow to do this. After so much time spent on small syntactic change, I don't feel making more complex change today.
Also, I think there should be a function PyArray_SetDataPtr as similar to PyArray_SetBaseObject.
Do you plan to add one? I though that you wanted to force the removing of the old API, but I never hear you wanted to disable this.
It was a design mistake to leak this in the first place, so the end goal (not for 1.7), is certainly to 'forbid' access. It is necessary to move numpy forward and keep ABI compatibility later on. Adding functions to directly access the underlying structures would defeat a lot of this. Regarding the need for new API: - speed issues: do you have any concrete measurements (or usecases) where this is problematic ? - updating the refcount: can you give an example ? thanks, David
Hi Fred, On Wed, Sep 5, 2012 at 10:56 AM, Nathaniel Smith <njs@pobox.com> wrote:
On Wed, Sep 5, 2012 at 6:36 PM, Frédéric Bastien <nouiz@nouiz.org> wrote:
Hi,
I spent up to now 2 or 3 days making change to Theano to support numpy 1.7b1. But now, I just find an interface change that will need recoding a function, not just small code change.
My understanding was that 1.7 is not supposed to require any code changes... so, separate from your actual question about assigning to the data field can I ask: are the changes you're talking about just to avoid *deprecated* APIs, or did you have actual problems running Theano against 1.7b1? And if you had actual problems, could you say what? (Or just post a diff of the changes you found you had to make, which should amount to the same thing?)
Thank you for trying the beta version, that was the purpose to put it out there and see if it breaks things. As others said, if you can give us more details, that'd be great. Let's get it fixed before rc1. Ondrej
Hi, I reply with more information probably later today or tomorrow, but I think i need to finish everything to give you the exact information. Part of the problem I had was that by default there is a warning that is generated. It tell that to remove this warning we need to set NPY_NO_DEPRECATED_API=NPY_1_7_API_VERSION. I didn't saw it before my first email, but some code I didn't wrote was passing the -Werror flag to g++, so this created errors for me. At first, I added this macro, but this caused error because we used the old C-API. I'm gatering all the correct information/comments and reply later with them. Fred On Wed, Sep 5, 2012 at 6:03 PM, Ondřej Čertík <ondrej.certik@gmail.com> wrote:
Hi Fred,
On Wed, Sep 5, 2012 at 10:56 AM, Nathaniel Smith <njs@pobox.com> wrote:
On Wed, Sep 5, 2012 at 6:36 PM, Frédéric Bastien <nouiz@nouiz.org> wrote:
Hi,
I spent up to now 2 or 3 days making change to Theano to support numpy 1.7b1. But now, I just find an interface change that will need recoding a function, not just small code change.
My understanding was that 1.7 is not supposed to require any code changes... so, separate from your actual question about assigning to the data field can I ask: are the changes you're talking about just to avoid *deprecated* APIs, or did you have actual problems running Theano against 1.7b1? And if you had actual problems, could you say what? (Or just post a diff of the changes you found you had to make, which should amount to the same thing?)
Thank you for trying the beta version, that was the purpose to put it out there and see if it breaks things.
As others said, if you can give us more details, that'd be great. Let's get it fixed before rc1.
Ondrej _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
On Thu, Sep 6, 2012 at 10:07 AM, Frédéric Bastien <nouiz@nouiz.org> wrote:
Hi,
I reply with more information probably later today or tomorrow, but I think i need to finish everything to give you the exact information.
Part of the problem I had was that by default there is a warning that is generated. It tell that to remove this warning we need to set NPY_NO_DEPRECATED_API=NPY_1_7_API_VERSION.
You don't want to define this macro if you need to directly access the fields. What warning are you getting if you don't define it? Are you using Cython? <snip> Chuck
Hi, On Wed, Sep 5, 2012 at 7:05 PM, David Cournapeau <cournape@gmail.com> wrote:
Hi Frederic,
On Wed, Sep 5, 2012 at 6:36 PM, Frédéric Bastien <nouiz@nouiz.org> wrote:
Hi,
I spent up to now 2 or 3 days making change to Theano to support numpy 1.7b1. But now, I just find an interface change that will need recoding a function, not just small code change.
The problem is that we can't access fields from PyArrayObject anymore, we absolutely must use the old macro/newly function.
Why can't you adress the PyArrayObject anymore ? It is deprecated, but the structure itself has not changed. It would certainly be a significant issue if that is not possible anymore, as it would be a significant API break.
For the data field, the new function don't allow to set it. There is no function that allow to do this. After so much time spent on small syntactic change, I don't feel making more complex change today.
Also, I think there should be a function PyArray_SetDataPtr as similar to PyArray_SetBaseObject.
Do you plan to add one? I though that you wanted to force the removing of the old API, but I never hear you wanted to disable this.
It was a design mistake to leak this in the first place, so the end goal (not for 1.7), is certainly to 'forbid' access. It is necessary to move numpy forward and keep ABI compatibility later on.
Is this still the goal? Is there a still a role for a simple numpy array structure for maximum speed of access? Best, Matthew
Hi, On Thu, Sep 6, 2012 at 11:32 AM, Charles R Harris <charlesr.harris@gmail.com> wrote:
On Thu, Sep 6, 2012 at 10:07 AM, Frédéric Bastien <nouiz@nouiz.org> wrote:
Hi,
I reply with more information probably later today or tomorrow, but I think i need to finish everything to give you the exact information.
Part of the problem I had was that by default there is a warning that is generated. It tell that to remove this warning we need to set NPY_NO_DEPRECATED_API=NPY_1_7_API_VERSION.
You don't want to define this macro if you need to directly access the fields. What warning are you getting if you don't define it? Are you using Cython?
If I don't define it and I remove the -Werror, I got 3 errors. 1 is related to an error message that was changed. The second was that we called numpy.dot() with 2 sparse matrix(from scipy). It worked in the past, but not now. Changing the test is easy. I don't expect people to have done this frequently, but maybe warning about this in the release note would help people to fix it faster. The error message is not helpful, it tell that it can't find a common dtype between float32 and float32 dtype. I changed the np.dot(a,b) to a*b as this is the matrix multiplication function for sparse matrix in scipy. This change remove the possibility to make a function that use matrix product to work with both ndarray and sparse matrix without special case for the object type. Not great, but there is an easy work around. So this stay like this in the release, there should be a warning. The third is releated to change to the casting rules in numpy. Before a scalar complex128 * vector float32 gived a vector of dtype complex128. Now it give a vector of complex64. The reason is that now the scalar of different category only change the category, not the precision. I would consider a must that we warn clearly about this interface change. Most people won't see it, but people that optimize there code heavily could depend on such thing. The other problem I had was related to the fact that I tryed to use only the new API. This took me a few day and it is not finished, as now I have a seg fault that is not easy to trigger. It happen in one tests, but only when other tests a ran before... This is probably an error from my change The sed script that replace some macro helped, but there is few macro change that is not in the file: NPY_ALIGNED to NPY_ARRAY_ALIGNED. idem for NPY_WRITABLE, NPY_UPDATE_ALL, NPY_C_CONTIGUOUS and NPY_F_CONTIGUOUS. The sed script change NPY_ENSURECOPY to NPY_ARRAY_ENSURECOPY, but I think that NPY_ARRAY_ENSURECOPY was introduced in numpy 1.7. Maybe warning somewhere in the API transition doc that if people want to stay compatible with older version of numpy, the should use an "#ifndef NPY_ARRAY_ENSURECOPY ..." in there code. I won't have the time to make a PR with those small change as I have a deadline the 16 september and the 1 october. I hope my comment will be helpful. If you still have questions, don't hesitate. Fred
On Sun, Sep 9, 2012 at 11:12 AM, Frédéric Bastien <nouiz@nouiz.org> wrote:
Hi,
On Thu, Sep 6, 2012 at 11:32 AM, Charles R Harris <charlesr.harris@gmail.com> wrote:
On Thu, Sep 6, 2012 at 10:07 AM, Frédéric Bastien <nouiz@nouiz.org>
wrote:
Hi,
I reply with more information probably later today or tomorrow, but I think i need to finish everything to give you the exact information.
Part of the problem I had was that by default there is a warning that is generated. It tell that to remove this warning we need to set NPY_NO_DEPRECATED_API=NPY_1_7_API_VERSION.
You don't want to define this macro if you need to directly access the fields. What warning are you getting if you don't define it? Are you using Cython?
If I don't define it and I remove the -Werror, I got 3 errors. 1 is related to an error message that was changed.
The second was that we called numpy.dot() with 2 sparse matrix(from scipy). It worked in the past, but not now. Changing the test is easy. I don't expect people to have done this frequently, but maybe warning about this in the release note would help people to fix it faster. The error message is not helpful, it tell that it can't find a common dtype between float32 and float32 dtype. I changed the np.dot(a,b) to a*b as this is the matrix multiplication function for sparse matrix in scipy. This change remove the possibility to make a function that use matrix product to work with both ndarray and sparse matrix without special case for the object type. Not great, but there is an easy work around. So this stay like this in the release, there should be a warning.
The third is releated to change to the casting rules in numpy. Before a scalar complex128 * vector float32 gived a vector of dtype complex128. Now it give a vector of complex64. The reason is that now the scalar of different category only change the category, not the precision. I would consider a must that we warn clearly about this interface change. Most people won't see it, but people that optimize there code heavily could depend on such thing.
The other problem I had was related to the fact that I tryed to use only the new API. This took me a few day and it is not finished, as now I have a seg fault that is not easy to trigger. It happen in one tests, but only when other tests a ran before... This is probably an error from my change
The sed script that replace some macro helped, but there is few macro change that is not in the file: NPY_ALIGNED to NPY_ARRAY_ALIGNED. idem for NPY_WRITABLE, NPY_UPDATE_ALL, NPY_C_CONTIGUOUS and NPY_F_CONTIGUOUS.
I can add those, they seem to have been present since at least Numpy 1.5.
The sed script change NPY_ENSURECOPY to NPY_ARRAY_ENSURECOPY, but I think that NPY_ARRAY_ENSURECOPY was introduced in numpy 1.7. Maybe warning somewhere in the API transition doc that if people want to stay compatible with older version of numpy, the should use an "#ifndef NPY_ARRAY_ENSURECOPY ..." in there code.
Hmm... Looks like you are right about NPY_ARRAY_ENSURECOPY. An alternative would be to not deprecate it, but an #ifndef would be better for the long term goal of having everyone use newer macros.
I won't have the time to make a PR with those small change as I have a deadline the 16 september and the 1 october. I hope my comment will be helpful. If you still have questions, don't hesitate.
Chuck
On Sun, Sep 9, 2012 at 1:42 PM, Charles R Harris <charlesr.harris@gmail.com>wrote:
On Sun, Sep 9, 2012 at 11:12 AM, Frédéric Bastien <nouiz@nouiz.org> wrote:
Hi,
On Thu, Sep 6, 2012 at 11:32 AM, Charles R Harris <charlesr.harris@gmail.com> wrote:
On Thu, Sep 6, 2012 at 10:07 AM, Frédéric Bastien <nouiz@nouiz.org>
wrote:
Hi,
I reply with more information probably later today or tomorrow, but I think i need to finish everything to give you the exact information.
Part of the problem I had was that by default there is a warning that is generated. It tell that to remove this warning we need to set NPY_NO_DEPRECATED_API=NPY_1_7_API_VERSION.
You don't want to define this macro if you need to directly access the fields. What warning are you getting if you don't define it? Are you using Cython?
If I don't define it and I remove the -Werror, I got 3 errors. 1 is related to an error message that was changed.
The second was that we called numpy.dot() with 2 sparse matrix(from scipy). It worked in the past, but not now. Changing the test is easy. I don't expect people to have done this frequently, but maybe warning about this in the release note would help people to fix it faster. The error message is not helpful, it tell that it can't find a common dtype between float32 and float32 dtype. I changed the np.dot(a,b) to a*b as this is the matrix multiplication function for sparse matrix in scipy. This change remove the possibility to make a function that use matrix product to work with both ndarray and sparse matrix without special case for the object type. Not great, but there is an easy work around. So this stay like this in the release, there should be a warning.
The third is releated to change to the casting rules in numpy. Before a scalar complex128 * vector float32 gived a vector of dtype complex128. Now it give a vector of complex64. The reason is that now the scalar of different category only change the category, not the precision. I would consider a must that we warn clearly about this interface change. Most people won't see it, but people that optimize there code heavily could depend on such thing.
The other problem I had was related to the fact that I tryed to use only the new API. This took me a few day and it is not finished, as now I have a seg fault that is not easy to trigger. It happen in one tests, but only when other tests a ran before... This is probably an error from my change
The sed script that replace some macro helped, but there is few macro change that is not in the file: NPY_ALIGNED to NPY_ARRAY_ALIGNED. idem for NPY_WRITABLE, NPY_UPDATE_ALL, NPY_C_CONTIGUOUS and NPY_F_CONTIGUOUS.
I can add those, they seem to have been present since at least Numpy 1.5.
The sed script change NPY_ENSURECOPY to NPY_ARRAY_ENSURECOPY, but I think that NPY_ARRAY_ENSURECOPY was introduced in numpy 1.7. Maybe warning somewhere in the API transition doc that if people want to stay compatible with older version of numpy, the should use an "#ifndef NPY_ARRAY_ENSURECOPY ..." in there code.
Hmm... Looks like you are right about NPY_ARRAY_ENSURECOPY. An alternative would be to not deprecate it, but an #ifndef would be better for the long term goal of having everyone use newer macros.
And the other *_ARRAY_* macros seem to have been defined in 1.5. If 1.7 is intended to be a long term release, they probably shouldn't be deprecated until a later release.
I won't have the time to make a PR with those small change as I have a deadline the 16 september and the 1 october. I hope my comment will be helpful. If you still have questions, don't hesitate.
Chuck
participants (6)
-
Charles R Harris
-
David Cournapeau
-
Frédéric Bastien
-
Matthew Brett
-
Nathaniel Smith
-
Ondřej Čertík