[Numpy-discussion] field names on numpy arrays

Wed Jun 3 20:36:21 EDT 2009

On Wed, Jun 3, 2009 at 8:25 PM,  <josef.pktd at gmail.com> wrote:
> On Wed, Jun 3, 2009 at 7:56 PM,  <josef.pktd at gmail.com> wrote:
>> On Wed, Jun 3, 2009 at 7:33 PM, Pierre GM <pgmdevlist at gmail.com> wrote:
>>>
>>> On Jun 3, 2009, at 7:23 PM, Robert Kern wrote:
>>>
>>>> On Wed, Jun 3, 2009 at 18:20, Pierre GM <pgmdevlist at gmail.com> wrote:
>>>>>
>>>>>
>>>>> Or, as all fields have the same dtype:
>>>>>
>>>>>  >>> a_array.view(dtype=('f',len(a_array.dtype)))
>>>>> array([[ 0.,  1.,  2.,  3.,  4.],
>>>>>        [ 1.,  2.,  3.,  4.,  5.]], dtype=float32)
>>>>>
>>>>> Ain't it fun ?
>>>>
>>>> Ah, yes, there is that niggle, too.
>>>
>>>
>>>
>>> Except that I always get bitten by that:
>>>
>>>  >>> backandforth = a_array.view(dtype=('f',len(a_array.dtype))).view(a_array.dtype)
>>>  >>> backandforth
>>> array([[(0.0, 1.0, 2.0, 3.0, 4.0)],
>>>        [(1.0, 2.0, 3.0, 4.0, 5.0)]],
>>>       dtype=[('a', '<f4'), ('b', '<f4'), ('c', '<f4'), ('d', '<f4'),
>>> ('e', '<f4')])
>>>  >>> backandforth.shape
>>> (2, 1)
>>>
>>> We gained a dimension!
>>>
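A minimal sketch of one way to drop the extra axis again. The thread never shows how a_array was built, so the construction below (five float32 fields named a to e) is a hypothetical stand-in, and the variable names are placeholders:

```python
import numpy as np

# Hypothetical stand-in for a_array: five float32 fields named a..e.
dt = np.dtype([(name, '<f4') for name in 'abcde'])
a_array = np.array([(0, 1, 2, 3, 4), (1, 2, 3, 4, 5)], dtype=dt)

# Plain 2-D float view; the reshape makes the field axis explicit.
plain = a_array.reshape(-1, 1).view(np.float32)   # shape (2, 5)

# Viewing back with the structured dtype keeps a trailing axis of length 1.
back = plain.view(dt)                             # shape (2, 1)

# Flattening drops that axis and recovers the original 1-D layout.
restored = back.reshape(-1)                       # shape (2,)
```

The reshape only changes shape metadata here, so no data should be copied along the way.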
>>
>> I went back through the archives to my first encounter with views, the
>> row-sorting trick proposed by Pierre. In that case no reshape was necessary.
>>
>> >>> np.sort(np.array([[4.0, 1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.0, 5.0]]).view(dt),0).view(float)
>> array([[ 1.,  2.,  3.,  4.,  5.],
>>       [ 4.,  1.,  2.,  3.,  4.]])
>>
>> >>> dt
>> [('a', '<f8'), ('b', '<f8'), ('c', '<f8'), ('d', '<f8'), ('e', '<f8')]
>>
>> Looking closer, the extra dimension helps to maintain the shape.
>>
>> Direct construction of a structured array gives a 1d array:
>>
>> >>> np.array([(0.0, 1.0, 2.0, 3.0, 4.0), (1.0, 2.0, 3.0, 4.0, 5.0)],dt)
>> array([(0.0, 1.0, 2.0, 3.0, 4.0), (1.0, 2.0, 3.0, 4.0, 5.0)],
>>       dtype=[('a', '<f8'), ('b', '<f8'), ('c', '<f8'), ('d', '<f8'), ('e', '<f8')])
>> >>> np.array([(0.0, 1.0, 2.0, 3.0, 4.0), (1.0, 2.0, 3.0, 4.0, 5.0)],dt).shape
>> (2,)
>>
>> A structured view on an existing 2d array is 2d:
>>
>> >>> np.array([(0.0, 1.0, 2.0, 3.0, 4.0), (1.0, 2.0, 3.0, 4.0, 5.0)]).view(dt).shape
>> (2, 1)
>>
>> A view on the view returns the original shape:
>>
>> >>> np.array([(0.0, 1.0, 2.0, 3.0, 4.0), (1.0, 2.0, 3.0, 4.0, 5.0)]).view(dt).view(float).shape
>> (2, 5)
>>
>> But sorting in between the two views also preserved the original shape.
>> This was the source of my initial confusion about whether a reshape
>> is necessary.
>>
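The row-sorting trick can be sketched as a standalone script. This is an illustration under assumptions, not the code from the archived thread: the field names f0..f4 are arbitrary placeholders, and the input is assumed to be a C-contiguous 2-D float array.

```python
import numpy as np

x = np.array([[4.0, 1.0, 2.0, 3.0, 4.0],
              [1.0, 2.0, 3.0, 4.0, 5.0]])

# One field per column; the names are arbitrary placeholders.
dt = np.dtype([(f'f{i}', x.dtype) for i in range(x.shape[1])])

# Each row becomes a single record, so the view has shape (2, 1).
as_records = x.view(dt)

# Sorting along axis 0 compares records field by field (lexicographically),
# and viewing back as float restores the (2, 5) layout without a reshape.
sorted_rows = np.sort(as_records, axis=0).view(x.dtype)
```

Because the sort runs on the (2, 1) array and returns the same shape, the final float view lands back on (2, 5), which is why no reshape shows up in this case.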
>
> Here is a minimal example for a 2d structured array:
>
> >>> dt = [('a', '<f8'), ('b', '<f8'), ('c', '<f8'), ('d', '<f8'), ('e', '<f8')]
> >>> ys = np.array([(0.0, 1.0, 2.0, 3.0, 4.0), (1.0, 2.0, 3.0, 4.0, 5.0)],dt)
> >>> ys.shape
> (2,)
> >>> ys.view(float)
> array([ 0.,  1.,  2.,  3.,  4.,  1.,  2.,  3.,  4.,  5.])
> >>> ys = ys.reshape((len(ys),1))
> >>> ys.shape
> (2, 1)
> >>> ys.view(float)
> array([[ 0.,  1.,  2.,  3.,  4.],
>       [ 1.,  2.,  3.,  4.,  5.]])
>
>
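The reshape-then-view step can be wrapped in a small helper. This is only a sketch: fields_as_columns is a hypothetical name, not a NumPy function, and it assumes a 1-D, contiguous structured array whose fields all share one dtype with no padding.

```python
import numpy as np

def fields_as_columns(arr):
    """Hypothetical helper: view a 1-D structured array as (n, nfields)."""
    names = arr.dtype.names
    base = arr.dtype[names[0]]
    # A plain view only makes sense if every field has the same dtype ...
    if any(arr.dtype[name] != base for name in names):
        raise TypeError("all fields must share one dtype")
    # ... and the record holds nothing but those fields (no padding).
    if arr.dtype.itemsize != base.itemsize * len(names):
        raise TypeError("record layout contains padding")
    # Adding the length-1 axis first makes the view come out 2-D.
    return arr.reshape(-1, 1).view(base)

dt = [(name, '<f8') for name in 'abcde']
ys = np.array([(0.0, 1.0, 2.0, 3.0, 4.0), (1.0, 2.0, 3.0, 4.0, 5.0)], dtype=dt)
cols = fields_as_columns(ys)   # shape (2, 5), shares memory with ys
```

The checks are there because reinterpreting the bytes of mixed or padded records as one float type would give meaningless numbers rather than an error.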

And one more as a summary: reshape, change dtype, change array type:

>>> ys = np.array([(0.0, 1.0, 2.0, 3.0, 4.0), (1.0, 2.0, 3.0, 4.0, 5.0)],dt)
>>> ys.view(float, np.matrix)
matrix([[ 0.,  1.,  2.,  3.,  4.,  1.,  2.,  3.,  4.,  5.]])
>>> ys.view(float, np.matrix).mean(0)
matrix([[ 0.,  1.,  2.,  3.,  4.,  1.,  2.,  3.,  4.,  5.]])
>>> ys.reshape(-1,1).view(float, np.matrix)
matrix([[ 0.,  1.,  2.,  3.,  4.],
        [ 1.,  2.,  3.,  4.,  5.]])
>>> ys.reshape(-1,1).view(float, np.matrix).mean(0)
matrix([[ 0.5,  1.5,  2.5,  3.5,  4.5]])
>>> ys.view(float, np.matrix).reshape(-1,len(ys.dtype)).mean(0)
matrix([[ 0.5,  1.5,  2.5,  3.5,  4.5]])
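For comparison, a sketch of the same per-field means with plain ndarrays instead of np.matrix. The second variant assumes NumPy 1.16 or later, where numpy.lib.recfunctions.structured_to_unstructured is available; the variable names are placeholders.

```python
import numpy as np
from numpy.lib import recfunctions as rfn

dt = [(name, '<f8') for name in 'abcde']
ys = np.array([(0.0, 1.0, 2.0, 3.0, 4.0), (1.0, 2.0, 3.0, 4.0, 5.0)], dtype=dt)

# Variant 1: reshape first, then view as float, then average over the rows.
means_view = ys.reshape(-1, 1).view(float).mean(axis=0)

# Variant 2: let recfunctions build the 2-D float array from the fields.
means_rfn = rfn.structured_to_unstructured(ys).mean(axis=0)
```

Both variants should give one mean per field, matching the reshaped matrix results above.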

The End

Josef