Fixing PyArray_Descr flags member size, ABI vs pickling issue
Hi, This is following the discussion on bug http://projects.scipy.org/numpy/ticket/2017 Essentially, there is a discrepency between the actual type of the flags member in the dtype C structure (char) and how it is declared in the descriptor table (T_INT). The problem is that we are damned if we fix it, damned if we are not: - fixing T_INT to T_BYTE. flag in python is now fixed, but it breaks pickled numpy arrays - not fixing it means that the value is broken at the python level. Changing the flag to be an actual int would break the ABI. I currently have this: https://github.com/numpy/numpy/pull/228 Which one is the more appropriate ? Maybe a deprecation warning for 1.7 and change it in 1.8 ? Note that we can fix the hash issue with either solution, David
On Tue, Mar 6, 2012 at 03:53, David Cournapeau <cournape@gmail.com> wrote:
Hi,
This is following the discussion on bug http://projects.scipy.org/numpy/ticket/2017
Essentially, there is a discrepency between the actual type of the flags member in the dtype C structure (char) and how it is declared in the descriptor table (T_INT). The problem is that we are damned if we fix it, damned if we are not: - fixing T_INT to T_BYTE. flag in python is now fixed, but it breaks pickled numpy arrays
Is the problem that T_BYTE returns a single-item string? My handwrapping skills are rusty (Cython has blissfully eradicated this from my memory), but aren't these T_INT/T_BYTE things just convenient shortcuts for exposing C struct members as Python attributes? Couldn't we just write a full getter function for returning the correct value, just returned as a Python int instead of a str. -- Robert Kern
On Tue, Mar 6, 2012 at 6:20 AM, Robert Kern <robert.kern@gmail.com> wrote:
On Tue, Mar 6, 2012 at 03:53, David Cournapeau <cournape@gmail.com> wrote:
Hi,
This is following the discussion on bug http://projects.scipy.org/numpy/ticket/2017
Essentially, there is a discrepency between the actual type of the flags member in the dtype C structure (char) and how it is declared in the descriptor table (T_INT). The problem is that we are damned if we fix it, damned if we are not: - fixing T_INT to T_BYTE. flag in python is now fixed, but it breaks pickled numpy arrays
Is the problem that T_BYTE returns a single-item string?
Yes (although it is actually what we want, instead of an int).
My handwrapping skills are rusty (Cython has blissfully eradicated this from my memory), but aren't these T_INT/T_BYTE things just convenient shortcuts for exposing C struct members as Python attributes? Couldn't we just write a full getter function for returning the correct value, just returned as a Python int instead of a str.
You're right, I did not think about this solution. That's certainly better than the two I suggested. Thanks, David
Why do we want to return a single string char instead of an int? There is a need for more flags on the dtype object. Using an actual attribute call seems like the way to go. This could even merge the contents of two struct members so that we can add more flags but preserve ABI compatibility. Travis -- Travis Oliphant (on a mobile) 512-826-7480 On Mar 6, 2012, at 8:07 AM, David Cournapeau <cournape@gmail.com> wrote:
On Tue, Mar 6, 2012 at 6:20 AM, Robert Kern <robert.kern@gmail.com> wrote:
On Tue, Mar 6, 2012 at 03:53, David Cournapeau <cournape@gmail.com> wrote:
Hi,
This is following the discussion on bug http://projects.scipy.org/numpy/ticket/2017
Essentially, there is a discrepency between the actual type of the flags member in the dtype C structure (char) and how it is declared in the descriptor table (T_INT). The problem is that we are damned if we fix it, damned if we are not: - fixing T_INT to T_BYTE. flag in python is now fixed, but it breaks pickled numpy arrays
Is the problem that T_BYTE returns a single-item string?
Yes (although it is actually what we want, instead of an int).
My handwrapping skills are rusty (Cython has blissfully eradicated this from my memory), but aren't these T_INT/T_BYTE things just convenient shortcuts for exposing C struct members as Python attributes? Couldn't we just write a full getter function for returning the correct value, just returned as a Python int instead of a str.
You're right, I did not think about this solution. That's certainly better than the two I suggested.
Thanks,
David _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
On Tue, Mar 6, 2012 at 18:25, Travis Oliphant <travis@continuum.io> wrote:
Why do we want to return a single string char instead of an int?
I suspect just to ensure that any provided value fits in the range 0..255. But that's easily done explicitly. -- Robert Kern
On Tue, Mar 6, 2012 at 1:44 PM, Robert Kern <robert.kern@gmail.com> wrote:
On Tue, Mar 6, 2012 at 18:25, Travis Oliphant <travis@continuum.io> wrote:
Why do we want to return a single string char instead of an int?
I suspect just to ensure that any provided value fits in the range 0..255. But that's easily done explicitly.
That was not even the issue in the end, my initial analysis was wrong. In any case, I have now a new PR that fixes both dtypes.flags value and dtype hashing reported in #2017: https://github.com/numpy/numpy/pull/231 regards, David
On Tue, Mar 6, 2012 at 1:25 PM, Travis Oliphant <travis@continuum.io> wrote:
Why do we want to return a single string char instead of an int?
There is a need for more flags on the dtype object. Using an actual attribute call seems like the way to go. This could even merge the contents of two struct members so that we can add more flags but preserve ABI compatibility.
Yes. The T_BYTE/T_INT is actually pretty minor compared to the underlying issue (where we cast back and forth between int and char). I will make a new PR that fixes everything but this exact point, and will put an actual accessor if needed. Given that dtype.flags is nonsensical as of today (at the python level), I would expect nobody uses it. cheers, David
participants (3)
-
David Cournapeau -
Robert Kern -
Travis Oliphant