Re: [Numpy-discussion] Setting custom dtypes and 1.14

25 Jan 2018

      On 01/25/2018 06:06 PM, Chris Barker wrote:
...
Hi all,
I'm pretty sure this is the same thing as recently discussed on this
list about 1.14, but to confirm:
I had failures in my code after upgrading to 1.14 -- it turned out to be a
single line in a single test fixture, so no big deal, but a regression
just the same, with no deprecation warning.
I was essentially doing this:
In [48]: dt
Out[48]: dtype([('time', '<i8'), ('value', [('u', '<f8'), ('v', '<f8')])], align=True)
In [49]: uv
Out[49]:
array([[1., 1.],
       [1., 1.],
       [1., 1.],
       [1., 1.]])
In [50]: time
Out[50]: array([1, 1, 1, 1])
In [51]: full = np.array(zip(time, uv), dtype=dt)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-51-ed726f71dd4a> in <module>()
----> 1 full = np.array(zip(time, uv), dtype=dt)
ValueError: setting an array element with a sequence.
It took some poking, but the solution was to do:
full = np.array(zip(time, (tuple(w) for w in uv)), dtype=dt)
That is, convert the values to nested tuples, rather than an array in a
tuple, or a list in a tuple.
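
For anyone reproducing this today, here is a minimal sketch of the tuple-based construction, assuming Python 3 (where zip() returns an iterator, so the records are collected into a list first). The names `records` and `alt` are illustrative only and are not from the original session; the field-by-field variant is one possible alternative, not necessarily the recommended one.

```python
import numpy as np

# Same nested dtype as in the session above.
dt = np.dtype([('time', '<i8'),
               ('value', [('u', '<f8'), ('v', '<f8')])], align=True)

time = np.ones(4, dtype='<i8')
uv = np.ones((4, 2), dtype='<f8')

# Each record is a tuple, and the nested 'value' field is itself a tuple.
# A list is built explicitly because zip() is lazy on Python 3.
records = [(t, tuple(row)) for t, row in zip(time, uv)]
full = np.array(records, dtype=dt)

# Alternative sketch: allocate first, then assign one field at a time,
# which avoids converting every row to a tuple.
alt = np.zeros(len(time), dtype=dt)
alt['time'] = time
alt['value']['u'] = uv[:, 0]
alt['value']['v'] = uv[:, 1]
```

Both arrays end up with the same field values; the second form relies on field indexing returning views into the parent array.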
As I said, my problem is solved, but to confirm:
1) This is a known change with good reason?

This change is a little different from what we discussed before. The
change occurred because the old assignment behavior was dangerous, and
was not doing what you thought. If you modify your dtype above, changing
both 'f8' fields to 'f4', you get very strange results: your array gets
filled with the values
(1, (0., 1.875)).

Here's what happened: Previously, numpy was *not* iterating your data as
a sequence. Instead, if numpy did not find a tuple, it would interpret
the data as a raw buffer and copy the value byte-by-byte, ignoring
endianness, casting, stride, etc. You can get even weirder results if
you do `uv = uv.astype('i4')`, for example.

It happened to work for you because ndarrays expose a buffer interface,
and you were assigning using exactly the same type and endianness.
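
The (0., 1.875) values can be reproduced without any structured arrays. The sketch below only illustrates what a byte-for-byte copy does to the data; it is not the code path numpy actually used. The name `one` is just for illustration.

```python
import numpy as np

# A single float64 value of 1.0, stored little-endian like the '<f8' fields.
one = np.array([1.0], dtype='<f8')

# Reinterpret (not cast) the same 8 bytes as two little-endian float32 values.
as_f4 = one.view('<f4')
print(as_f4.tolist())  # [0.0, 1.875]

# A real cast converts the value instead of copying raw bytes.
print(one.astype('<f4').tolist())  # [1.0]
```

The low four bytes of 1.0 are all zero, and the high four bytes happen to decode as 1.875 when read as a float32.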

In 1.14 the fix was to disallow this 'buffer' assignment for structured
arrays, because it was causing quite confusing bugs. Unstructured "void"
arrays still do this though.
...
2) My solution was the best (only) one -- the only way to set a nested
dtype like that is with tuples?

Right, our solution was to only allow assignment from tuples.

We might be able to relax that for structured scalars, but for arrays I
remember one consideration was to avoid confusion with array
broadcasting: If you do

    >>> x = np.zeros(2, dtype='i4,i4')
    >>> x[:] = np.array([3, 4])
    >>> x
    array([(3, 3), (4, 4)], dtype=[('f0', '<i4'), ('f1', '<i4')])

it might be the opposite of what you expect. Compare to

    >>> x[:] = (3, 4)
    >>> x
    array([(3, 4), (3, 4)], dtype=[('f0', '<i4'), ('f1', '<i4')])
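
If the intent is for each row of a 2-D array to become one record, a sketch along these lines should work. The `rows` array is made up for illustration, and the second approach assumes NumPy 1.16 or later, where `numpy.lib.recfunctions.unstructured_to_structured` was added (so it was not available in 1.14).

```python
import numpy as np
from numpy.lib import recfunctions as rfn

x = np.zeros(2, dtype='i4,i4')
rows = np.array([[3, 4], [5, 6]], dtype='i4')  # illustrative data

# Convert each row to a tuple so it is read as one record (f0, f1).
x[:] = [tuple(r) for r in rows]
print(x.tolist())  # [(3, 4), (5, 6)]

# Assumption: NumPy >= 1.16. The last axis of `rows` is mapped onto the fields.
y = rfn.unstructured_to_structured(rows, dtype=x.dtype)
print(y.tolist())  # [(3, 4), (5, 6)]
```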
...
If so, then I think we should:
A) improve the error message.
"ValueError: setting an array element with a sequence."
is not really clear -- I spent a while trying to figure out how I could
set a nested dtype like that without a sequence, and I was actually
using an ndarray, so it wasn't even a generic sequence. And a tuple is a
sequence, too...
I had a vague recollection that in some circumstances, numpy treats
tuples and lists (and arrays) differently (fancy indexing??), so I tried
the tuple thing and that worked. But I've been around numpy a long time
-- that could have been very very confusing to many people.
So could the message be changed to something like:
"ValueError: setting an array element with a generic sequence. Only the
tuple type can be used in this context."
or something like that -- I'm not sure where else this same error
message might pop up, so that could be totally inappropriate.

Good idea. I'll see if we can do it for 1.14.1.
...
B) maybe add a .totuple() method to ndarray, much like the .tolist()
method? That would have been handy here.
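
No such method exists on ndarray; as a rough sketch of what it might do, here is a hypothetical standalone helper (the name `totuple` and its behavior are assumptions for illustration, not a numpy API):

```python
import numpy as np

def totuple(a):
    """Hypothetical helper: recursively convert an array to nested tuples."""
    a = np.asarray(a)
    if a.ndim == 0:
        # 0-d arrays and numpy scalars become plain Python scalars.
        return a.item()
    return tuple(totuple(sub) for sub in a)

uv = np.ones((4, 2))
print(totuple(uv[0]))   # (1.0, 1.0)
print(totuple(uv[:2]))  # ((1.0, 1.0), (1.0, 1.0))
```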
...
-Chris
--
Christopher Barker, Ph.D.
Oceanographer
Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception
Chris.Barker@noaa.gov

Allan Haldane