On 01/25/2018 06:06 PM, Chris Barker wrote:
I'm pretty sure this is the same thing as recently discussed on this list about 1.14, but to confirm:
I had failures in my code with an upgrade for 1.14 -- turns out it was a single line in a single test fixture, so no big deal, but a regression just the same, with no deprecation warning.
I was essentially doing this:
In [48]: dt
Out[48]: dtype([('time', '<i8'), ('value', [('u', '<f8'), ('v', '<f8')])], align=True)
In [49]: uv
In [50]: time
Out[50]: array([1, 1, 1, 1])
In [51]: full = np.array(zip(time, uv), dtype=dt)
ValueError                                Traceback (most recent call last)
----> 1 full = np.array(zip(time, uv), dtype=dt)
ValueError: setting an array element with a sequence.
It took some poking, but the solution was to do:
full = np.array(zip(time, (tuple(w) for w in uv)), dtype=dt)
That is, convert the values to nested tuples, rather than an array in a tuple, or a list in a tuple.
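For anyone following along, here is a minimal self-contained sketch of that conversion. The contents of `uv` are not shown in the session above, so a (4, 2) float64 array is assumed purely for illustration. On Python 3, `zip` returns an iterator, so the records are collected in a list first.

```python
import numpy as np

dt = np.dtype([('time', '<i8'), ('value', [('u', '<f8'), ('v', '<f8')])],
              align=True)

time = np.array([1, 1, 1, 1])
# Assumption: uv is not shown in the original session; a (4, 2) float
# array is used here only as a stand-in.
uv = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]])

# Each record is a tuple, and the nested 'value' field is itself a tuple.
records = [(t, tuple(w)) for t, w in zip(time, uv)]
full = np.array(records, dtype=dt)
```

The nested fields can then be read back with `full['value']['u']` and `full['value']['v']`.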
As I said, my problem is solved, but to confirm:
- This is a known change with good reason?
This change is a little different from what we discussed before. The change occurred because the old assignment behavior was dangerous, and was not doing what you thought. If you modify your dtype above changing both 'f8' fields to 'f4', you will see you get very strange results: Your array gets filled in with the values (1, ( 0., 1.875)).
Here's what happened: Previously, numpy was *not* iterating your data as a sequence. Instead, if numpy did not find a tuple, it would interpret the data as a raw buffer and copy the value byte-by-byte, ignoring endianness, casting, stride, etc. You can get even weirder results if you do `uv = uv.astype('i4')`, for example.
It happened to work for you because ndarrays expose a buffer interface, and you were assigning using exactly the same type and endianness.
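The (0., 1.875) result mentioned above is what you get when the 8 bytes of a float64 1.0 are read as two float32 values. The sketch below reproduces that with `ndarray.view`. It only illustrates the byte reinterpretation; it does not go through the old assignment code path, and the input value of 1.0 is an assumption, since `uv` is not shown.

```python
import numpy as np

# Assumed input: a single little-endian float64 with value 1.0.
src = np.array([1.0], dtype='<f8')

# Reinterpret the same 8 bytes as two float32 values; no casting happens.
reinterpreted = src.view('<f4')
reinterpreted.tolist()  # [0.0, 1.875]

# For comparison, a real cast preserves the value.
cast = src.astype('<f4')
cast.tolist()  # [1.0]
```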
In 1.14 the fix was to disallow this 'buffer' assignment for structured arrays, since it was causing quite confusing bugs. Unstructured "void" arrays still do this, though.
- My solution was the best (only) one -- the only way to set a nested
dtype like that is with tuples?
Right, our solution was to only allow assignment from tuples.
We might be able to relax that for structured scalars, but for arrays I remember one consideration was to avoid confusion with array broadcasting: If you do
>>> x = np.zeros(2, dtype='i4,i4')
>>> x[:] = np.array([3, 4])
>>> x
array([(3, 3), (4, 4)], dtype=[('f0', '<i4'), ('f1', '<i4')])
it might be the opposite of what you expect. Compare to
>>> x[:] = (3, 4)
>>> x
array([(3, 4), (3, 4)], dtype=[('f0', '<i4'), ('f1', '<i4')])
If so, then I think we should:
A) improve the error message.
"ValueError: setting an array element with a sequence."
is not really clear -- I spent a while trying to figure out how I could set a nested dtype like that without a sequence. And I was actually using an ndarray, so it wasn't even a generic sequence. And a tuple is a sequence, too...
I had a vague recollection that in some circumstances, numpy treats tuples and lists (and arrays) differently (fancy indexing??), so I tried the tuple thing and that worked. But I've been around numpy a long time -- that could have been very very confusing to many people.
So could the message be changed to something like:
"ValueError: setting an array element with a generic sequence. Only the tuple type can be used in this context."
or something like that -- I'm not sure where else this same error message might pop up, so that could be totally inappropriate.
Good idea. I'll see if we can do it for 1.14.1.
- maybe add a .totuple() method to ndarray, much like the .tolist()
method? That would have been handy here.
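Until something like that exists, a small helper can fill the gap. The sketch below is hypothetical: `to_nested_tuples` is not part of numpy, and it simply converts the output of `ndarray.tolist()` from nested lists to nested tuples.

```python
import numpy as np

def to_nested_tuples(a):
    """Hypothetical helper (not a numpy API): like tolist(), but with tuples."""
    def convert(obj):
        # tolist() yields nested lists for N-d arrays; turn each list
        # into a tuple and leave scalars (and existing tuples) alone.
        if isinstance(obj, list):
            return tuple(convert(item) for item in obj)
        return obj
    return convert(np.asarray(a).tolist())

uv_example = np.array([[1.0, 2.0], [3.0, 4.0]])  # illustrative data only
nested = to_nested_tuples(uv_example)  # ((1.0, 2.0), (3.0, 4.0))
```

Iterating over the result gives one tuple per row, which is the form the nested field assignment needs.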
Christopher Barker, Ph.D. Oceanographer
Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception
NumPy-Discussion mailing list NumPy-Discussion@python.org https://mail.python.org/mailman/listinfo/numpy-discussion