[Numpy-discussion] Records in scipy core

Fri Dec 2 12:05:01 EST 2005

Travis Oliphant wrote:

> Perry Greenfield wrote:
>
>> For us, probably not critical since we have to do some rewriting anyway.
>> (But it would be nice to retain for a while as deprecated).
>
> Easy enough to do by defining an actual record array (however, see 
> below).   I've been retaining backwards compatibility in other ways 
> while not documenting it.  For example, you can actually now pass in 
> strings like 'Int32' for types.
>
>> But what about field names that don't map well to attributes?
>> I haven't had a chance to reread the past emails but I seem to
>> recall this was a significant issue. That would imply that .field()
>> would be needed for those cases anyway.
>
> What I'm referring to as the solution here is a slight modification to 
> what Perry described.  In other words, all arrays have the attribute
>
> .fields

What I suggested in my posting was that there is no need and no benefit 
from the .fields attribute.

The base class Record could be organized so that certain attributes 
which are used in arrays are not acceptable as field names.  For example, 
one would probably wish to avoid shape, size and the other attributes of 
the basic array, but attributes associated only with arrays of numeric 
types would probably not need to be barred.
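
To make the idea concrete, here is a minimal sketch in plain Python.  It 
is only an illustration: the reserved-name set (shape and size come from 
the paragraph above, flat is my own assumed addition) and the layout of 
the Record class are assumptions, not anything that exists in scipy core.

```python
# Hypothetical sketch: a Record base class that refuses field names
# which would shadow attributes of the basic array.

# Assumed set of barred names; a real one would be derived from the array type.
RESERVED_ARRAY_ATTRIBUTES = frozenset({"shape", "size", "flat"})


class Record:
    """Base class for records whose fields are plain attributes."""

    def __init__(self, **fields):
        for name, value in fields.items():
            setattr(self, name, value)

    def __setattr__(self, name, value):
        # Reject names that collide with basic array attributes.
        if name in RESERVED_ARRAY_ATTRIBUTES:
            raise AttributeError(
                "%r is reserved for arrays and cannot be a field name" % name
            )
        object.__setattr__(self, name, value)


person = Record(name="Ann", age=41)
print(person.age)

try:
    Record(shape=(2, 3))
except AttributeError as exc:
    print("rejected:", exc)
```

With this arrangement the check lives in one place, so every subclass of 
Record inherits the same restrictions.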

>
> You can set this attribute to a dictionary which will automagically 
> give field names to any array.  This dictionary has ordered lists of 
> 'names', (optionally) 'titles', and "(data-descr, [offset])" entries 
> which define the mapping.  If offset is not given, then the 
> "next-available" offset is assumed.  The data-descr is either 1) a 
> data-type or 2) a tuple of (data-type, shape).   The data-type is 
> either a defined data-type or alias, or an object with a .fields 
> attribute that provides the same dictionary and an .itemsize attribute 
> that computes the total size of the data-type.
>
I wonder about the need for explicit dictionary operations.  Can't this 
be handled through the class structure?
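
For reference, this is how I read the dictionary described in the quoted 
text, as a rough sketch in plain Python.  The key name "descrs", the 
ITEMSIZE table and the function names are all my own assumptions; only 
'names', the optional offset and the (data-type, shape) form come from 
the description.  I also assume that the "next-available" offset is the 
first byte past every field placed so far.

```python
# Hypothetical sketch of the proposed fields dictionary; not a scipy core API.

# Assumed item sizes in bytes for a few type aliases.
ITEMSIZE = {"Int16": 2, "Int32": 4, "Float64": 8}


def descr_size(data_descr):
    """Size in bytes of a data-descr: a data-type or a (data-type, shape) tuple."""
    if isinstance(data_descr, tuple):
        data_type, shape = data_descr
        count = 1
        for dim in shape:
            count *= dim
        return ITEMSIZE[data_type] * count
    return ITEMSIZE[data_descr]


def resolve_layout(fields):
    """Return ({name: (data-descr, offset)}, total size), filling in offsets."""
    layout = {}
    next_available = 0
    for name, entry in zip(fields["names"], fields["descrs"]):
        data_descr = entry[0]
        # An entry is (data-descr,) or (data-descr, offset).
        offset = entry[1] if len(entry) > 1 else next_available
        layout[name] = (data_descr, offset)
        next_available = max(next_available, offset + descr_size(data_descr))
    return layout, next_available


fields = {
    "names": ["id", "position", "weight"],
    "descrs": [("Int32",), (("Float64", (3,)),), ("Float64", 32)],
}
layout, itemsize = resolve_layout(fields)
for name in fields["names"]:
    print(name, layout[name])
print("itemsize:", itemsize)
```

Here "weight" carries an explicit offset, while "id" and "position" take 
the next available one.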

>
> You can get this attribute which returns a special fields object 
> (written in Python initially like the flags attribute) that can look 
> up field names like a dictionary, or with attribute access for names 
> that are either 1) acceptable or 2) have a user-provided "python-name" 
> associated with them. 
> Thus,
>
> .fields['home address']
>
> would always work
>
> but
>
> .fields.hmaddr
>
> would only work if the user had previously made the association hmaddr 
> -> 'home address' for the data type of this array.   Thus 'home 
> address' would be a title but hmaddr would be the name.
>
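
A minimal sketch of the lookup behaviour described here, in plain Python. 
The class name Fields and the python_names argument are hypothetical, 
chosen only for illustration; the field values are made up.

```python
class Fields:
    """Hypothetical fields object: dictionary lookup always works,
    attribute lookup works for acceptable names or registered python-names."""

    def __init__(self, values, python_names=None):
        self._values = dict(values)                    # field key -> value
        self._python_names = dict(python_names or {})  # python-name -> field key

    def __getitem__(self, key):
        return self._values[key]

    def __getattr__(self, name):
        # Only reached when normal attribute lookup fails.
        if name.startswith("_"):
            raise AttributeError(name)
        key = self._python_names.get(name, name)
        try:
            return self._values[key]
        except KeyError:
            raise AttributeError(name) from None


fields = Fields(
    {"home address": "12 Elm St", "age": 41},
    python_names={"hmaddr": "home address"},
)
print(fields["home address"])  # dictionary access always works
print(fields.hmaddr)           # works because the association was made
print(fields.age)              # 'age' is already an acceptable attribute name

bare = Fields({"home address": "12 Elm St"})
print(hasattr(bare, "hmaddr"))  # no association was made for this object
```
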
> The records module would simply provide functions for making record 
> arrays and a record data type.
> Driving my thinking is the concept that the notion of a record array 
> is really a description of the data type of the array (not the array 
> itself).  Thus, all the fields information should really just be part 
> of the data type itself.  Now, I don't really want to create and 
> register a new data type every time somebody has a new record layout.
>
A record array is an array which has a record as its data element, in 
the same way that an integer array has an integer as its element.

I don't understand the notion of registering a data type.  Presumably an 
integer array has a pointer to the appropriate type of integer.  Could 
the record array not have a pointer to the appropriate record type?

> So, I've been re-thinking the notion of "registering a data-type".  It 
> seems to me that while it's O.K. to have a set of pre-defined data 
> types, the notion of data-type ought to be flexible enough to allow 
> the user to define one "on-the-fly".
> I'm thinking of ways to do this right now.  Any suggestions are welcome.
>
The record types would be created "on-the-fly" as the class is 
instantiated.  The array, through the dtype parameter, would point to the 
record type.
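
Roughly, and only as a sketch in plain Python, I have something like the 
following in mind.  The helper make_record_type and the toy RecordArray 
class are hypothetical names of mine; a real array would of course not 
store Python objects in a list.

```python
def make_record_type(type_name, field_names):
    """Create a record class on the fly (hypothetical helper)."""
    field_names = tuple(field_names)

    def __init__(self, *values):
        if len(values) != len(field_names):
            raise TypeError("expected %d values" % len(field_names))
        for name, value in zip(field_names, values):
            setattr(self, name, value)

    namespace = {"__init__": __init__, "field_names": field_names}
    return type(type_name, (object,), namespace)


class RecordArray:
    """Toy array whose dtype attribute points to the record type of its elements."""

    def __init__(self, rows, dtype):
        self.dtype = dtype
        self._items = [dtype(*row) for row in rows]

    def __getitem__(self, index):
        return self._items[index]

    def __len__(self):
        return len(self._items)


# No registration step: the type exists as soon as it is created.
Person = make_record_type("Person", ["name", "age"])
people = RecordArray([("Ann", 41), ("Bob", 29)], dtype=Person)
print(people.dtype is Person)
print(people[1].name, people[1].age)
```

The point is that the array only needs a reference to the record type, 
in the same way that an integer array refers to its integer type.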

>
> -Travis

Colin W.