[Numpy-discussion] Proposed record array behavior: the rest of the story: updated

Colin J. Williams cjw at sympatico.ca
Tue Jul 27 11:22:27 EDT 2004


Russell E Owen wrote:

> At 5:41 PM -0400 2004-07-26, Colin J. Williams wrote:
>
>> Russell E Owen wrote:
>>
>>>  At 11:43 AM -0400 2004-07-26, Perry Greenfield wrote:
>>>
>>>>  I'll try to see if I can address all the comments raised (please 
>>>> let me know
>>>>  if I missed something).
>>>>  ...(nice proposal elided)...
>>>>  Any comments on these changes to the proposal? Are there those 
>>>> that are
>>>>  opposed to supporting attribute access?
>>>
>>>
>>>
>>>  Overall this sounds great.
>>>
>>>  However, I am still strongly against attribute access.
>>>
>>>  Attributes are usually meant for names that are intrinsic to the 
>>> design of an object, not to the user's "configuration" of the object.
>>
>>
>> Russell, I hope that you will elaborate on this distinction between 
>> design and usage.  On the face of it, I would have thought that the 
>> two should be closely related.
>
>
> To my mind, the design of an object describes the intended behavior of 
> the object: what kind of data can it deal with and what should it do 
> to that data. It tends to be "static" in the sense that it is not a 
> function of how the object is created or what data is contained in the 
> object. The design of the object usually drives the choice of the 
> attributes of the object (variables and methods).
>
> On the other hand, the user's "configuration" of the object is what 
> the user has done to make a particular instance of an object unique -- 
> the data the user has loaded into the object.
>
> I consider the particular named fields of a record array to fall into 
> the latter category. But it is a gray area. Somebody else might argue 
> that the record array constructor is an object factory, turning out 
> an object designed by the user. From that alternative perspective, 
> adding attributes to represent field names is perhaps more natural as 
> a design.
>
> I think the main issues are:
> - Are there too many ways to address things? (I say yes) 

This could be true.  I guess the test is whether there is a rational 
justification for each way.

>
> - Field name mapping: there is no trivial 1:1 mapping between valid 
> field names and valid attribute names. 

If one starts with the assumption that field/attribute names are 
compatible with Python names, then I don't see that this is a problem.  
The question has been raised as to whether a wider range of names should 
be permitted, e.g. including such characters as ~`()!çéë.  My view is 
that such characters should be considered acceptable for data labels, 
but not for data names, i.e. they are for display, not for manipulation.
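
As a rough sketch of the name/label split described above: the helper 
is_valid_field_name and the sample fields below are hypothetical and are 
not part of numarray.  The idea is that a name must pass Python's 
identifier rules, while the label is free-form display text.

```python
import keyword


def is_valid_field_name(name):
    """Return True if name is usable as a Python attribute name.

    Hypothetical helper for illustration only.
    """
    return name.isidentifier() and not keyword.iskeyword(name)


# Hypothetical fields: the name is for manipulation, the label for display.
fields = [
    ("postalCode", "Postal code (A1A 1A1)"),
    ("rate", "Rate (%) ~ approx."),
]

# Every name in the table passes the check; the labels are never tested.
all_names_ok = all(is_valid_field_name(name) for name, _label in fields)
```

Note that current Python versions accept letters such as ç or é in 
identifiers, so a check like this rejects only names containing spaces, 
punctuation, a leading digit, or a reserved word.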

>
> - Nested access. Not sure about this one, but I'd like to hear more. 

A RecArray is made up of a number of records, each of the same length 
and data configuration.  Each field of a record is of fixed length and 
type.  It wouldn't be a big leap to permit another record in one of the 
fields.

Suppose we have an address record aRec and a personnel record pRec and 
that rArr is an array of pRec.
aRec
  street: a30
  city: a20
  postalCode: a7

pRec
  id: i4
  firstName: a15
  lastName: a20
  homeAddress: aRec
  workAddress: aRec

Then rArr[16].homeAddress.city could give us the home city for person 16 
in rArr.
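
To make the nesting concrete, here is a minimal sketch using the standard 
ctypes module.  It is not the record array API under discussion; it only 
assumes that a30 maps to a 30-byte character field and i4 to a 32-bit 
integer, and the class names ARec and PRec are illustrative spellings of 
aRec and pRec.

```python
import ctypes


class ARec(ctypes.Structure):
    """Address record: three fixed-length character fields."""
    _fields_ = [
        ("street", ctypes.c_char * 30),      # a30
        ("city", ctypes.c_char * 20),        # a20
        ("postalCode", ctypes.c_char * 7),   # a7
    ]


class PRec(ctypes.Structure):
    """Personnel record with two nested address records."""
    _fields_ = [
        ("id", ctypes.c_int32),              # i4
        ("firstName", ctypes.c_char * 15),   # a15
        ("lastName", ctypes.c_char * 20),    # a20
        ("homeAddress", ARec),               # nested record
        ("workAddress", ARec),               # nested record
    ]


# rArr: an array of 20 personnel records, zero-initialised.
rArr = (PRec * 20)()

# Fill in one record; the city value is made up for the example.
rArr[16].id = 16
rArr[16].homeAddress.city = b"Toronto"

# Nested attribute access, as in the text.
home_city = rArr[16].homeAddress.city
```

Every record has the same size and layout, and the nested record simply 
occupies a fixed-length slot inside its parent.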

>
>
> If we do end up with attributes for field names, I really like Rick 
> White's suggestion of adding an attribute for a field only if the 
> field name is already a valid attribute name. That neatly avoids the 
> collision issue and is simple to document.
>
> -- Russell 
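
One possible reading of Rick White's suggestion, as a sketch only: the 
Record class below is hypothetical and is not taken from numarray.  It 
assumes item access always works, and that attribute access is offered 
only for field names that are legal Python names and do not collide with 
an existing attribute or method.

```python
import keyword


class Record:
    """Hypothetical record: rec[name] always works, rec.name sometimes."""

    def __init__(self, fields):
        self._fields = dict(fields)

    def __getitem__(self, name):
        return self._fields[name]

    def keys(self):
        """Return the field names; an existing method a field could clash with."""
        return list(self._fields)

    def __getattr__(self, name):
        # Python calls __getattr__ only after normal lookup has failed,
        # so real attributes and methods always win over field names.
        fields = self.__dict__.get("_fields", {})
        if name in fields and name.isidentifier() and not keyword.iskeyword(name):
            return fields[name]
        raise AttributeError(name)


# Sample data is made up for the example.
rec = Record({"city": "Toronto", "home address": "1 Main St", "keys": "spare"})

city_by_attr = rec.city               # legal name, no collision
address = rec["home address"]         # not a legal name: item access only
keys_field = rec["keys"]              # collides with the method: item access only
```

Here rec.keys is still the method, so the collision is resolved in favour 
of the object's own design, and the rule fits in one sentence of 
documentation.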

Best wishes,

Colin W.

>
>




