[Numpy-discussion] Rec array: numpy.rec vs numpy.array with complex dtype

Fri Jun 26 16:54:32 EDT 2009

On Jun 26, 2009, at 3:59 PM, Dan Yamins wrote:
>
> Short answer:
> a np.recarray is a subclass of ndarray with structured dtype, where
> fields can be accessed as attributes (as in 'yourarray.yourfield')
> instead of as items (as in yourarray['yourfield']).
>
> Is this the only substantial thing added in the recarray class?

AFAIK, yes.
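
To make the difference concrete, here is a minimal sketch of the two access styles on the same data. The variable names are only illustrative, and it assumes a NumPy recent enough to provide np.shares_memory:

```python
import numpy as np

# Plain ndarray with a structured dtype: fields are accessed as items.
arr = np.array([(1, 10), (2, 20)], dtype=[('a', int), ('b', int)])
col_item = arr['a']

# The same data viewed as a recarray: fields are also available as attributes.
rec = arr.view(np.recarray)
col_attr = rec.a

# Both styles return the same values, and the view does not copy the data.
same_values = bool((col_item == col_attr).all())
shares_memory = np.shares_memory(arr, rec)
```

Item access keeps working on the recarray as well, so rec['a'] and rec.a can be mixed freely.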

> The fact that you can access some fields via attribute notation?  We  
> haven't been using this feature anyhow ... (what happens when the
> field names have spaces?)

Well, spaces in a field name are a bad idea, but nothing prevents you
from using them (I wonder whether we shouldn't check for them in the
definition of the dtype). Anyway, that will of course fail gloriously
if you try to access such a field by attribute.
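
A small sketch of the problem, using a made-up field name 'my field'. Item access is unaffected by the space, while the attribute form cannot even be written as valid Python:

```python
import numpy as np

# 'my field' is a hypothetical field name containing a space.
arr = np.array([(1, 10), (2, 20)], dtype=[('my field', int), ('b', int)])
rec = arr.view(np.recarray)

# Item access works regardless of the space.
values = rec['my field'].tolist()

# The attribute form `rec.my field` is rejected by the parser itself.
try:
    compile("rec.my field", "<example>", "eval")
    attribute_form_compiles = True
except SyntaxError:
    attribute_form_compiles = False
```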

> Is the recarray class still being developed actively?

I don't know. There's not much you can add to it, is there?

>
> My favorite way to get a np.recarray is to define a standard ndarray
> w/ structured dtype, and then take a view as a recarray
> Example:
>  >>> np.array([(1,10),(2,20)], dtype=[('a',int),('b',int)]).view(np.recarray)
>
> Is the purpose of this basically to use the property of recarrays of  
> accessing fields as attributes?   Or do you have other reasons why  
> you like this view?

You're correct, it's only to provide a more convenient way to access  
fields. I personally stopped using recarrays in favor of the easier  
ndarrays w/ structured dtype. If I really need to access fields as  
attributes, I'd write a subclass and make each field accessible as a
property.
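
As a rough illustration of that last idea, here is one way such a subclass could look. The class name FieldsAsProperties and the fields 'a' and 'b' are hypothetical; this is a sketch, not something that exists in numpy:

```python
import numpy as np

class FieldsAsProperties(np.ndarray):
    """Hypothetical ndarray subclass exposing fields 'a' and 'b' as properties."""

    @property
    def a(self):
        # Return the field as a plain ndarray view of the same memory.
        return np.asarray(self)['a']

    @a.setter
    def a(self, value):
        self['a'] = value

    @property
    def b(self):
        return np.asarray(self)['b']

    @b.setter
    def b(self, value):
        self['b'] = value

data = np.array([(1, 10), (2, 20)], dtype=[('a', int), ('b', int)])
points = data.view(FieldsAsProperties)

a_values = points.a.tolist()   # read through the property
points.b = [30, 40]            # write through the property setter
```

Because the object is a view, assigning through points.b also changes the underlying data array.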

> Do you recommend a place we can learn about the interesting things  
> one can do with structured data types?  Or is the on-line  
> documentation on the scipy site the best as of now?

http://docs.scipy.org/doc/numpy/reference/arrays.dtypes.html is a good  
start. Feel free to start some tutorial page.

>
> np.rec.fromrecords processes the array and tries to guess the best type
> for each field, but it's slow and not always correct
>
> Evidently.  But sometimes (in fact, a lot of times, in our  
> particular applications),  the type inference works fine and the  
> slowdown is not large enough to be noticeable.
> And of course in the recarray constructors one can override the type  
> inference by including a 'dtype' or 'formats' argument as well.    
> Obviously we can write constructor functions that include type  
> inference algorithms of our own, ... but having a "standard" way to  
> do this, with best practices maintained in the numpy core would be  
> quite useful nonetheless.
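
For reference, the three variants mentioned above (inferred types, an explicit 'dtype', and a 'formats' string) might look like the sketch below. The sample rows and field names are made up for illustration:

```python
import numpy as np

# Made-up sample records: an integer, a float and a one-character string.
rows = [(1, 2.5, 'x'), (2, 3.5, 'y')]

# Let fromrecords infer a type for each field; only the names are given.
inferred = np.rec.fromrecords(rows, names='a,b,c')

# Override the inference with an explicit structured dtype.
explicit = np.rec.fromrecords(
    rows, dtype=[('a', 'i8'), ('b', 'f8'), ('c', 'U1')])

# Or override it with a 'formats' string plus field names.
by_formats = np.rec.fromrecords(rows, formats='i8,f8,U1', names='a,b,c')
```

Printing inferred.dtype shows what the inference chose, which is a quick way to check whether an override is needed.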

Well, you can always use the functions of the np.rec module
(fromfile, fromstring, fromrecords...). You can also have a look at
np.lib.io.genfromtxt, a function to create an ndarray (or recarray, or
MaskedArray) from a text file. I don't think overloading np.array to
support cases like the ones you described is a good idea: I prefer to
have some specific tools (like the np.rec functions) rather than one
catch-all function.
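
A minimal sketch of the genfromtxt route, assuming a NumPy version that exposes the function as np.genfromtxt and accepts an in-memory text buffer. The CSV content here is invented for the example:

```python
import io
import numpy as np

# Invented two-column text table with a header line.
text = io.StringIO("a,b\n1,2.5\n2,3.5")

# names=True takes field names from the header; dtype=None asks genfromtxt
# to guess a type for each column individually.
table = np.genfromtxt(text, delimiter=',', names=True, dtype=None)

# The result is a plain ndarray with a structured dtype; take a recarray
# view only if attribute access is wanted.
rec = table.view(np.recarray)
```

Here table['a'] and rec.a refer to the same column.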