[Numpy-discussion] Records in scipy core
Travis Oliphant
oliphant.travis at ieee.org
Fri Dec 2 10:37:03 EST 2005
Perry Greenfield wrote:
>
>
>For us, probably not critical since we have to do some rewriting anyway.
>(But it would be nice to retain for a while as deprecated).
>
>
Easy enough to do by defining an actual record array (however, see
below). I've been retaining backwards compatibility in other ways
while not documenting it. For example, you can actually now pass in
strings like 'Int32' for types.
>But what about field names that don't map well to attributes?
>I haven't had a chance to reread the past emails but I seem to
>recall this was a significant issue. That would imply that .field()
>would be needed for those cases anyway.
>
>
What I'm referring to as the solution here is a slight modification to
what Perry described. In other words, all arrays have the attribute
.fields
You can set this attribute to a dictionary, which automagically gives
field names to any array. This dictionary has ordered lists of 'names',
(optionally) 'titles', and "(data-descr, [offset])" entries, which
together define the mapping. If offset is not given, then the
"next-available" offset is assumed. The data-descr is either 1) a
data-type or 2) a tuple of (data-type, shape). The data-type is either a
defined data-type or alias, or an object with a .fields attribute that
provides the same dictionary and an .itemsize attribute that gives the
total size of the data-type.
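To make the offset rule concrete, here is a rough pure-Python sketch of how
such a dictionary could be resolved into a layout. Nothing here is actual
scipy core code: the 'descrs' key, the ITEMSIZE table, and the function
names are made up for illustration, and "next-available" is assumed to mean
the first byte past the furthest field placed so far.

```python
# Hypothetical sizes for a few primitive type names (illustration only).
ITEMSIZE = {"int32": 4, "float64": 8, "uint8": 1}


def type_size(dtype):
    # A nested record type is assumed to be any object exposing .itemsize.
    if hasattr(dtype, "itemsize"):
        return dtype.itemsize
    return ITEMSIZE[dtype]


def descr_size(descr):
    # A data-descr is either a data-type or a (data-type, shape) tuple.
    if isinstance(descr, tuple):
        dtype, shape = descr
        count = 1
        for dim in shape:
            count *= dim
        return type_size(dtype) * count
    return type_size(descr)


def resolve_offsets(fields):
    """Return ({name: (data-descr, offset)}, total itemsize)."""
    layout = {}
    next_free = 0
    for name, entry in zip(fields["names"], fields["descrs"]):
        # Each entry is a 1- or 2-tuple: (data-descr,) or (data-descr, offset).
        descr, *explicit = entry
        offset = explicit[0] if explicit else next_free
        layout[name] = (descr, offset)
        next_free = max(next_free, offset + descr_size(descr))
    return layout, next_free


fields = {
    "names": ["id", "rgb", "score"],
    "descrs": [
        ("int32",),              # no offset given: placed at byte 0
        (("uint8", (3,)),),      # (data-type, shape), placed at byte 4
        ("float64", 8),          # explicit offset, leaving one pad byte
    ],
}
layout, itemsize = resolve_offsets(fields)
```

With these inputs, "rgb" lands at offset 4, "score" at the requested
offset 8, and the total item size comes out as 16 bytes.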
Getting this attribute returns a special fields object (written in
Python initially, like the flags attribute) that can look up field names
like a dictionary, or with attribute access for names that either
1) are acceptable as Python attribute names or 2) have a user-provided
"python-name" associated with them.
Thus,
.fields['home address']
would always work
but
.fields.hmaddr
would only work if the user had previously made the association hmaddr
-> 'home address' for the data type of this array. Thus 'home address'
would be a title but hmaddr would be the name.
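A small pure-Python sketch of a fields object behaving this way follows.
The class name FieldsView, its constructor arguments, and the
(data-descr, offset) values it stores are all hypothetical; this is only
one way the lookup could work, not the planned implementation.

```python
class FieldsView:
    """Dictionary-style and attribute-style lookup of record fields."""

    def __init__(self, by_title, names=None):
        # by_title: {title: field info}; names: {python-name: title}.
        self._by_title = dict(by_title)
        self._names = dict(names or {})

    def __getitem__(self, key):
        # Accept either a title or a registered python-name.
        title = self._names.get(key, key)
        return self._by_title[title]

    def __getattr__(self, attr):
        # Only reached when normal attribute lookup fails.
        if attr.startswith("_"):
            raise AttributeError(attr)
        try:
            return self[attr]
        except KeyError:
            raise AttributeError(attr) from None


info = {"home address": ("S40", 0), "age": ("int32", 40)}

# With the association hmaddr -> 'home address' both spellings work.
fields = FieldsView(info, names={"hmaddr": "home address"})
by_title = fields["home address"]
by_name = fields.hmaddr
plain = fields.age  # already usable as an attribute, no alias needed

# Without the association, attribute access on hmaddr fails.
try:
    FieldsView(info).hmaddr
    missing_alias_failed = False
except AttributeError:
    missing_alias_failed = True
```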
The records module would simply provide functions for making record
arrays and a record data type.
Driving my thinking is the concept that the notion of a record array is
really a description of the data type of the array (not the array
itself). Thus, all the fields information should really just be part of
the data type itself. Now, I don't really want to create and register a
new data type every time somebody has a new record layout.
So, I've been re-thinking the notion of "registering a data-type". It
seems to me that while it's O.K. to have a set of pre-defined data
types, the notion of data-type ought to be flexible enough to allow the
user to define one "on-the-fly".
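For a sense of what an on-the-fly record data type can look like, here is
a short example using the np.dtype constructor found in present-day NumPy.
That interface postdates this message and is shown only as an illustration
of the idea, not as the design being worked out here; the field names are
made up.

```python
import numpy as np

# Describe a record layout inline, with no registration step.
person = np.dtype([
    ("name", "S20"),                 # 20-byte string
    ("age", np.int32),               # 4-byte integer
    ("position", np.float64, (3,)),  # (data-type, shape) sub-array
])

# The layout belongs to the data type, not to any particular array.
people = np.zeros(2, dtype=person)
people["age"] = [31, 45]
people["name"][0] = b"Ada"

age_offset = person.fields["age"][1]  # byte offset of 'age' in a record
```

Here person.itemsize is 48 bytes (20 + 4 + 24, unpadded) and 'age' sits
at byte offset 20.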
I'm thinking of ways to do this right now. Any suggestions are welcome.
-Travis