[Numpy-discussion] Calculations with mixed type structured arrays

josef.pktd at gmail.com josef.pktd at gmail.com
Thu Jun 4 11:30:30 EDT 2009

After yesterday's discussion, I wanted to see whether views of structured
arrays with mixed types can be used easily.

Is the following useful for the numpy user guide?

Josef

Calculations with mixed type structured arrays
----------------------------------------------

>>> import numpy as np

The following array has two integer columns and three float columns:

>>> dt = np.dtype([('a', '<i4'), ('b', '<i4'), ('c', '<f8'),
...                ('d', '<f8'), ('e', '<f8')])

>>> xs = np.ones(3,dt)
>>> print xs.shape
(3,)
>>> print repr(xs)
array([(1, 1, 1.0, 1.0, 1.0), (1, 1, 1.0, 1.0, 1.0), (1, 1, 1.0, 1.0, 1.0)],
      dtype=[('a', '<i4'), ('b', '<i4'), ('c', '<f8'), ('d', '<f8'),
             ('e', '<f8')])
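
To see why the views further down behave as they do, it can help to look
at the byte layout of one record. This is a minimal sketch in Python 3
syntax (the post itself uses Python 2); `dt` and `xs` are the names from
the example above, and `offsets` is a name introduced here for illustration.

```python
import numpy as np

dt = np.dtype([('a', '<i4'), ('b', '<i4'), ('c', '<f8'),
               ('d', '<f8'), ('e', '<f8')])
xs = np.ones(3, dt)

# One record is packed without padding: 2 * 4 bytes + 3 * 8 bytes.
print(dt.itemsize)            # 32

# Byte offset of every field inside a record; dt.fields maps each
# field name to a (dtype, offset) tuple.
offsets = {name: dt.fields[name][1] for name in dt.names}
print(offsets)                # {'a': 0, 'b': 4, 'c': 8, 'd': 16, 'e': 24}
```

The two integers share the first 8 bytes of each record, which is exactly
the size of one float.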

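If we try to view it as float, the memory of the two integers in
each record is lumped together into a single float. We get
numbers that don't represent our data correctly, and we
lose one element per record. If the record size is not a multiple
of the float size, the memory cannot be interpreted under the new
dtype at all and the view raises an error. Both cases are shown here: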

>>> print xs.view(float)
[  2.12199579e-314   1.00000000e+000   1.00000000e+000   1.00000000e+000
   2.12199579e-314   1.00000000e+000   1.00000000e+000   1.00000000e+000
   2.12199579e-314   1.00000000e+000   1.00000000e+000   1.00000000e+000]
>>> print xs.view(float).shape
(12,)

>>> dt0 = np.dtype([('a', '<i4'), ('c', '<f8'),
...                ('d', '<f8'), ('e', '<f8')])
>>> np.ones(3,dt0).view(float)
Traceback (most recent call last):
ValueError: new type not compatible with array.
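
The odd value 2.12199579e-314 is the eight bytes of two int32 ones read
as one float64. The error for `dt0` appears to come from its 28-byte
record, which does not split into whole 8-byte floats. Here is a small
Python 3 sketch to check both points. The exact error message differs
between numpy versions, so the code catches the exception broadly; `pair`,
`lumped` and `view_failed` are names introduced here for illustration.

```python
import numpy as np

# Two int32 ones occupy 8 bytes; reinterpreted as a single float64
# they give the tiny denormal number seen in the output above.
pair = np.ones(2, dtype='<i4')
lumped = pair.view('<f8')
print(lumped)                 # one value, roughly 2.122e-314

dt0 = np.dtype([('a', '<i4'), ('c', '<f8'), ('d', '<f8'), ('e', '<f8')])
print(dt0.itemsize)           # 28
print(dt0.itemsize % 8)       # 4, so a record is not a whole number of floats

try:
    np.ones(3, dt0).view(float)
    view_failed = False
except (ValueError, TypeError) as exc:
    view_failed = True
    print("view failed:", exc)
```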

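However, we can construct a new dtype that groups the two integer
fields into one subarray field and the three float fields into another,
so that a view exposes the integer part and the float part separately: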

>>> dt2 = np.dtype([('A', '<i4',2), ('B', '<f8', 3)])

>>> print repr(xs.view(dt2))
array([([1, 1], [1.0, 1.0, 1.0]), ([1, 1], [1.0, 1.0, 1.0]),
       ([1, 1], [1.0, 1.0, 1.0])],
      dtype=[('A', '<i4', 2), ('B', '<f8', 3)])

Now we can access the two subarrays and perform
calculations with them:

>>> print xs.view(dt2)['B'].mean(0)
[ 1.  1.  1.]
>>> print xs.view(dt2)['A'].mean(0)
[ 1.  1.]
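
For comparison, the same column means can be computed without a view by
stacking the individual fields into a new regular array, which copies the
data. This is a sketch in Python 3 syntax; `float_cols` and `view_b` are
names introduced here for illustration.

```python
import numpy as np

dt = np.dtype([('a', '<i4'), ('b', '<i4'), ('c', '<f8'),
               ('d', '<f8'), ('e', '<f8')])
dt2 = np.dtype([('A', '<i4', 2), ('B', '<f8', 3)])
xs = np.ones(3, dt)

# Copying approach: build a regular (3, 3) float array from the fields.
float_cols = np.column_stack([xs[name] for name in ('c', 'd', 'e')])

# View approach from the text: the subarray field reuses the memory of xs.
view_b = xs.view(dt2)['B']

print(float_cols.shape, view_b.shape)
print(float_cols.mean(0))
print(view_b.mean(0))

# Only the view shares memory with the structured array.
print(np.shares_memory(float_cols, xs))
print(np.shares_memory(view_b, xs))
```

Both give the same means. The difference is that changes to `float_cols`
never reach `xs`, while changes to `view_b` do.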

We can also assign names to the two views and calculate with them
(almost) as if they were regular arrays.
The new variables are still only views on the original
memory, so changing them in place also changes the original
structured array:

>>> xva = xs.view(dt2)['A']
>>> xvb = xs.view(dt2)['B']

>>> xva *= range(1,3)
>>> xvb[:,:] = xvb*range(1,4)

>>> print xs
[(1, 2, 1.0, 2.0, 3.0) (1, 2, 1.0, 2.0, 3.0) (1, 2, 1.0, 2.0, 3.0)]
>>> print xva.mean(0)
[ 1.  2.]
>>> print xvb.mean(0)
[ 1.  2.  3.]
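
One detail worth spelling out is why the float update is written as
`xvb[:,:] = ...`: only in-place operations write through to `xs`, whereas
a plain assignment just rebinds the name to a new array. Here is a sketch
of the same steps in Python 3 syntax, using `np.arange` instead of `range`
as an illustrative choice.

```python
import numpy as np

dt = np.dtype([('a', '<i4'), ('b', '<i4'), ('c', '<f8'),
               ('d', '<f8'), ('e', '<f8')])
dt2 = np.dtype([('A', '<i4', 2), ('B', '<f8', 3)])
xs = np.ones(3, dt)

xva = xs.view(dt2)['A']
xvb = xs.view(dt2)['B']

# In-place operations write through the views into xs.
xva *= np.arange(1, 3)
xvb[:, :] = xvb * np.arange(1, 4)
print(xs['b'], xs['e'])

# Plain assignment creates a new array and rebinds the name xvb;
# xs is left untouched by this line.
xvb = xvb * 10
print(xs['e'])
```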