[Numpy-discussion] can int and float exists in one array?(about difference in indexing Matlab matrix and Numpy array)
Tim Hochberg
tim.hochberg at ieee.org
Mon Nov 27 21:33:34 EST 2006
Zhang Sam wrote:
> Thanks for so many replies.
>
> In fact, I want to use several arrays to store the original data from
> a practical project. In every array, two or three columns will
> store indices. The main computation is still on matrices (float type)
> that are built from the original data. When building the main
> matrix, I need to use the indices stored in the original data
> repeatedly. So I hope both int and float can exist in one array with
> numpy, just for the original data.
>
> Before Python, I used Matlab and Fortran. In Matlab, it works just as I
> have said. In Fortran, a module can be used for storing different data
> types. I have only just started using Python, so I don't know which
> approach in Python is best for my case. What do you suggest?
>
Here are slightly more fleshed out suggestions:
1. Break the indices out into a separate matrix. That is, instead of
one matrix 'x' containing both the indices and the data, have two
matrices: 'x_indices' of type int and 'x_data' of type float. This is
probably what I would do, at least given my limited knowledge of the
problem, since with suitable names for the two matrices this is likeliest
to be the clearest. For example:
    import numpy as np
    x_indices = np.array([2, 1], dtype=int)
    x_data = np.array([[2.5, 3.5], [2.6, 3.5]], dtype=float)
2. Use record arrays. This allows you to pack different types into a
single matrix, but you then need to refer to the different fields
(formerly columns) by name:
    import numpy as np
    my_dtype = np.dtype([('indices', int), ('data_1', float),
                         ('data_2', float)])
    x = np.array([(2, 2.5, 3.5), (1, 2.6, 3.5)], dtype=my_dtype)
    print(x['indices'])
    print(x['data_1'])
    print(x['data_2'])
Whether this makes any sense will depend on your actual use case.
3. Use object arrays as already suggested.
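For option 3, here is a minimal sketch, assuming the same two rows of sample data used in the other options (the name 'x_obj' is only illustrative). An object array holds references to ordinary Python objects, so ints and floats can sit side by side, but you would normally convert the columns back to numeric dtypes before doing any real computation:

```python
import numpy as np

# Each element is a plain Python object, so int and float can be mixed.
x_obj = np.array([[2, 2.5, 3.5], [1, 2.6, 3.5]], dtype=object)

# Pull the columns back out as typed arrays when numeric work is needed.
indices = x_obj[:, 0].astype(int)
data = x_obj[:, 1:].astype(float)
```

The extra conversion step is one reason this option tends to be less convenient than keeping the indices and data in separate typed arrays.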
There are several other approaches I can think of (for example, if you
are willing to swap rows and columns, you could create a tuple of
arrays). However, in the absence of some compelling reason to do
otherwise, I'd use #1.
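As a rough sketch of the tuple-of-arrays idea, assuming each former column becomes its own 1-D array (the names 'columns' and 'main', and the 3x3 shape, are made up for illustration), the integer column can be used directly as an index when filling the float matrix:

```python
import numpy as np

# One array per former column, each with its own dtype.
columns = (
    np.array([2, 1], dtype=int),        # row indices
    np.array([2.5, 2.6], dtype=float),  # first data column
    np.array([3.5, 3.5], dtype=float),  # second data column
)
indices, data_1, data_2 = columns

# Hypothetical main matrix: put each record's data in the row its index names.
main = np.zeros((3, 3), dtype=float)
main[indices, 0] = data_1
main[indices, 1] = data_2
```

The same indexing works unchanged with the separate 'x_indices' and 'x_data' arrays from #1. Note that NumPy indices are zero-based, so indices carried over from Matlab need to be shifted down by one.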
-tim