[Numpy-discussion] Setting custom dtypes and 1.14

Chris Barker chris.barker at noaa.gov
Fri Jan 26 15:38:06 EST 2018


On Fri, Jan 26, 2018 at 10:48 AM, Allan Haldane <allanhaldane at gmail.com>
wrote:

> > What do folks think about a totuple() method — even before this I’ve
> > wanted that. But in this case, it seems particularly useful.
>


> Two thoughts:
>
> 1. `totuple` makes most sense for 2d arrays. But what should it do for
> 1d or 3+d arrays? I suppose it could make the last dimension a tuple, so
> 1d arrays would give a list of tuples of size 1.
>

I was thinking it would be exactly like .tolist() but with tuples -- so
you'd get tuples all the way down (or is that turtles?)

In this use case, it would have saved me the generator expression:

(tuple(r) for r in arr)

not a huge deal, but it would be nice not to have to write that, and to
have the looping be in C with no intermediate array generation.
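
For illustration, here is a pure-Python sketch of what such a method might
return. The name totuple and its semantics are assumptions (ndarray has no
such method), and this version still loops in Python rather than in C:

```python
import numpy as np

def totuple(arr):
    """Hypothetical helper: like arr.tolist(), but with nested tuples."""
    def convert(obj):
        # tolist() nests plain lists, one level per array dimension
        if isinstance(obj, list):
            return tuple(convert(item) for item in obj)
        return obj  # scalars (and structured records) pass through unchanged
    return convert(np.asarray(arr).tolist())

uv = np.arange(6.0).reshape(3, 2)
nested = totuple(uv)  # a tuple of three 2-tuples
```

For a structured array, the records that tolist() already returns as tuples
are left alone, so only the outer list levels change.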

> 2. structured array's .tolist() already returns a list of tuples. If we
> have a 2d structured array, would it add one more layer of tuples?


no -- why? it would return a tuple of tuples instead.


> That
> would raise an exception if read back in by `np.array` with the same dtype.
>

Hmm -- indeed, if the top-level structure is a tuple, the array constructor
gets confused:

This works fine -- as it should:


In [84]: new_full = np.array(full.tolist(), full.dtype)


But this does not:


In [85]: new_full = np.array(tuple(full.tolist()), full.dtype)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-85-c305063184ff> in <module>()
----> 1 new_full = np.array(tuple(full.tolist()), full.dtype)

ValueError: could not assign tuple of length 4 to structure with 2 fields.

I was hoping it would dig down to the inner structures looking for a match
to the dtype, rather than looking at the type of the top level. Oh well.
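
A small reproduction of the difference, using a hypothetical two-field dtype
as a stand-in for the `full` array (whose definition is not shown in this
message). With a structured dtype, NumPy treats a tuple as a single record
and a list as a sequence of records:

```python
import numpy as np

# Hypothetical stand-in for `full`: four records, two fields
dt = np.dtype([('u', 'f8'), ('v', 'f8')])
full = np.array([(0.0, 1.0), (2.0, 3.0), (4.0, 5.0), (6.0, 7.0)], dtype=dt)

# A list of tuples round-trips: each tuple becomes one record
from_list = np.array(full.tolist(), dtype=full.dtype)

# A top-level tuple is taken as ONE record with four items, which does not
# fit a two-field structure
try:
    np.array(tuple(full.tolist()), dtype=full.dtype)
    tuple_failed = False
except (ValueError, TypeError):
    # 1.14 raised ValueError here; the exact type may differ by version
    tuple_failed = True
```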

So yeah, not sure where you would go from tuple to list -- probably at the
bottom level, but that may not always be unambiguous.

> These points make me think that instead of a `.totuple` method, this
> might be more suitable as a new function in np.lib.recfunctions.


I don't seem to have that module -- and I'm running 1.14.0 -- is this a new
idea?
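
For reference, numpy.lib.recfunctions is an existing submodule, but it is
typically not loaded by a plain `import numpy`, which would explain why the
attribute appears to be missing. A minimal sketch of importing it explicitly
(the dtype below is made up for illustration):

```python
import numpy as np
from numpy.lib import recfunctions as rfn  # explicit import of the submodule

dt = np.dtype([('time', 'f8'), ('u', 'f8')])  # illustrative dtype
names = rfn.get_names(dt)  # field names of the structured dtype
```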


> If the
> goal is to help manipulate structured arrays, that submodule is
> appropriate since it already has other functions to manipulate fields in
> similar ways. What about calling it `pack_last_axis`?
>
> def pack_last_axis(arr, names=None):
>     if arr.names:
>         return arr
>     names = names or ['f{}'.format(i) for i in range(arr.shape[-1])]
>     return arr.view([(n, arr.dtype) for n in names]).squeeze(-1)
>
> Then you could do:
>
>     >>> pack_last_axis(uv).tolist()
>
> to get a list of tuples.
>

Not sure what the idea is here -- in my example, I had a regular 2-d array,
so no names:

In [90]: pack_last_axis(uv)
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-90-a75ee44c8401> in <module>()
----> 1 pack_last_axis(uv)

<ipython-input-89-cfbc76779d1f> in pack_last_axis(arr, names)
      1 def pack_last_axis(arr, names=None):
----> 2     if arr.names:
      3         return arr
      4     names = names or ['f{}'.format(i) for i in range(arr.shape[-1])]
      5     return arr.view([(n, arr.dtype) for n in names]).squeeze(-1)

AttributeError: 'numpy.ndarray' object has no attribute 'names'


So maybe you meant something like:


In [95]: def pack_last_axis(arr, names=None):
    ...:     try:
    ...:         arr.names
    ...:         return arr
    ...:     except AttributeError:
    ...:         names = names or ['f{}'.format(i) for i in range(arr.shape[-1])]
    ...:         return arr.view([(n, arr.dtype) for n in names]).squeeze(-1)

which does work, but seems like a convoluted way to get tuples!
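
If the intent of the check was to pass structured arrays through untouched,
note that field names live on the dtype, not on the array. A sketch under
that assumption (a guess at the intent, not necessarily what was meant):

```python
import numpy as np

def pack_last_axis(arr, names=None):
    # Structured arrays expose their field names via arr.dtype.names;
    # for a plain ndarray this is None
    if arr.dtype.names:
        return arr
    names = names or ['f{}'.format(i) for i in range(arr.shape[-1])]
    # View each row as one record with a field per column, then drop the
    # now length-1 last axis
    return arr.view([(n, arr.dtype) for n in names]).squeeze(-1)

uv = np.arange(6.0).reshape(3, 2)   # illustrative plain 2-d array
packed = pack_last_axis(uv)         # 1-d structured array, fields f0 and f1
rows = packed.tolist()              # list of tuples
```

The view only works when the last axis is contiguous in memory, so a sliced
or transposed input may need np.ascontiguousarray() first.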

However, I didn't actually need tuples, I needed something I could pack
into a structured array, and this does work, without the tolist:

full = np.array(list(zip(time, pack_last_axis(uv))), dtype=dt)


So maybe that is the way to go.
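
A different technique that avoids tuples and zip altogether is to allocate
the structured array first and fill it field by field. This is a sketch, not
the approach discussed above; the data and the dtype layout (a scalar time
field plus a two-element uv field) are assumptions, since the original `dt`
is not shown in this message:

```python
import numpy as np

# Illustrative stand-ins for the data in this thread
time = np.array([0.0, 1.0, 2.0])
uv = np.arange(6.0).reshape(3, 2)
dt = np.dtype([('time', 'f8'), ('uv', 'f8', (2,))])  # assumed layout

full = np.empty(len(time), dtype=dt)
full['time'] = time   # shape (3,)
full['uv'] = uv       # shape (3, 2) matches the (2,) subarray field
```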

I'm not sure I'd have thought to look for this function, but what can you
do?

Thanks for your attention to this,

-CHB

-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov