arr.names should have been arr.dtype.names in that pack_last_axis function
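
For reference, a minimal sketch of the function with that one-line correction applied. It assumes a C-contiguous, non-structured input such as the 2-d `uv` array from the thread; the sample values below are made up for illustration.

```python
import numpy as np

def pack_last_axis(arr, names=None):
    # Structured arrays expose field names on the dtype, not on the array.
    # For a plain array, arr.dtype.names is None, so we fall through.
    if arr.dtype.names:
        return arr
    names = names or ['f{}'.format(i) for i in range(arr.shape[-1])]
    # View each row as a single record, then drop the now length-1 last axis.
    return arr.view([(n, arr.dtype) for n in names]).squeeze(-1)

# Made-up stand-in for the (N, 2) array discussed in the thread.
uv = np.array([[1.0, 2.0], [3.0, 4.0]])
packed = pack_last_axis(uv)
print(packed.tolist())
```

With this version a plain 2-d array is packed into records, while an array that is already structured is returned unchanged.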

Eric



On Fri, 26 Jan 2018 at 12:45 Chris Barker <chris.barker@noaa.gov> wrote:
On Fri, Jan 26, 2018 at 10:48 AM, Allan Haldane <allanhaldane@gmail.com> wrote:
> What do folks think about a totuple() method — even before this I’ve
> wanted that. But in this case, it seems particularly useful.
 
Two thoughts:

1. `totuple` makes most sense for 2d arrays. But what should it do for
1d or 3+d arrays? I suppose it could make the last dimension a tuple, so
1d arrays would give a list of tuples of size 1.

I was thinking it would be exactly like .tolist() but with tuples -- so you'd get tuples all the way down (or is that turtles?)

In this use case, it would have saved me the generator expression:

(tuple(r) for r in arr)

Not a huge deal, but it would be nice not to have to write that, and to have the looping done in C with no intermediate array generation.
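
To make the "tuples all the way down" idea concrete, here is a rough pure-Python sketch. `totuple` is a hypothetical helper, not an existing NumPy method, and it loops in Python rather than in C, so it only illustrates the semantics being proposed.

```python
import numpy as np

def totuple(arr):
    """Hypothetical helper: like ndarray.tolist(), but with nested tuples."""
    def convert(obj):
        # tolist() yields nested lists; turn every list level into a tuple.
        if isinstance(obj, list):
            return tuple(convert(item) for item in obj)
        return obj
    return convert(np.asarray(arr).tolist())

# Made-up sample data standing in for the 2-d array from the thread.
uv = np.array([[1, 2], [3, 4]])
print(totuple(uv))
```

For a 1-d array this gives a flat tuple of scalars, matching what `.tolist()` does with lists.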

2. A structured array's .tolist() already returns a list of tuples. If we
have a 2d structured array, would it add one more layer of tuples?

no -- why? it would return a tuple of tuples instead.
 
That
would raise an exception if read back in by `np.array` with the same dtype.

Hmm -- indeed, if the top-level structure is a tuple, the array constructor gets confused:

This works fine -- as it should:


In [84]: new_full = np.array(full.tolist(), full.dtype)


But this does not:


In [85]: new_full = np.array(tuple(full.tolist()), full.dtype)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-85-c305063184ff> in <module>()
----> 1 new_full = np.array(tuple(full.tolist()), full.dtype)

ValueError: could not assign tuple of length 4 to structure with 2 fields.


I was hoping it would dig down to the inner structures looking for a match to the dtype, rather than looking at the type of the top level. Oh well.

So yeah, not sure where you would go from tuple to list -- probably at the bottom level, but that may not always be unambiguous.
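
A small self-contained sketch of the list-versus-tuple behaviour described above. The dtype and records are made up for illustration (the thread does not show the actual `dt`); the point is only that, with a structured dtype, a tuple is read as a single record.

```python
import numpy as np

# Assumed two-field dtype and sample records, not the ones from the thread.
dt = np.dtype([("time", "f8"), ("speed", "f8")])
records = [(0.0, 1.5), (1.0, 2.5), (2.0, 3.5)]

# A list of tuples: each tuple becomes one record.
from_list = np.array(records, dtype=dt)

# A tuple of tuples: the outer tuple is itself treated as one record,
# so its length (3) does not match the number of fields (2).
try:
    np.array(tuple(records), dtype=dt)
    tuple_failed = False
except (ValueError, TypeError):
    tuple_failed = True

print(from_list.shape, tuple_failed)
```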

These points make me think that instead of a `.totuple` method, this
might be more suitable as a new function in np.lib.recfunctions.

I don't seem to have that module -- and I'm running 1.14.0 -- is this a new idea?
 
If the
goal is to help manipulate structured arrays, that submodule is
appropriate since it already has other functions to manipulate fields in
similar ways. What about calling it `pack_last_axis`?

def pack_last_axis(arr, names=None):
    if arr.names:
        return arr
    names = names or ['f{}'.format(i) for i in range(arr.shape[-1])]
    return arr.view([(n, arr.dtype) for n in names]).squeeze(-1)

Then you could do:

    >>> pack_last_axis(uv).tolist()

to get a list of tuples.

Not sure what the idea is here -- in my example, I had a regular 2-d array, so no names:

In [90]: pack_last_axis(uv)
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-90-a75ee44c8401> in <module>()
----> 1 pack_last_axis(uv)

<ipython-input-89-cfbc76779d1f> in pack_last_axis(arr, names)
      1 def pack_last_axis(arr, names=None):
----> 2     if arr.names:
      3         return arr
      4     names = names or ['f{}'.format(i) for i in range(arr.shape[-1])]
      5     return arr.view([(n, arr.dtype) for n in names]).squeeze(-1)

AttributeError: 'numpy.ndarray' object has no attribute 'names'


So maybe you meant something like:


In [95]: def pack_last_axis(arr, names=None):
    ...:     try:
    ...:         arr.names
    ...:         return arr
    ...:     except AttributeError:
    ...:         names = names or ['f{}'.format(i) for i in range(arr.shape[-1])]
    ...:         return arr.view([(n, arr.dtype) for n in names]).squeeze(-1)


which does work, but seems like a convoluted way to get tuples!

However, I didn't actually need tuples, I needed something I could pack into a structured array, and this does work, without the tolist:

full = np.array(zip(time, pack_last_axis(uv)), dtype=dt)


So maybe that is the way to go.
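
For comparison, a different technique that avoids tuples altogether: preallocate the structured array and assign each field directly. This is only a sketch; the dtype `dt`, the field names, and the sample values are assumptions, since the thread does not show the real definitions.

```python
import numpy as np

# Assumed inputs: N time stamps and an (N, 2) array of u/v components.
time = np.array([0.0, 1.0, 2.0])
uv = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])

# Assumed dtype: a scalar 'time' field plus a length-2 'uv' subarray field.
dt = np.dtype([("time", "f8"), ("uv", "f8", (2,))])

# Allocate once, then fill field by field; no zip or tuple conversion needed.
full = np.empty(len(time), dtype=dt)
full["time"] = time
full["uv"] = uv

print(full[1])
```

Field-wise assignment keeps the copying inside NumPy, at the cost of writing one line per field.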

I'm not sure I'd have thought to look for this function, but what can you do?

Thanks for your attention to this,

-CHB

--

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker@noaa.gov
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion