[Numpy-discussion] Re: Scipy Core (Was Numeric 3.0)

Colin J. Williams cjw at sympatico.ca
Sun Dec 11 06:53:01 EST 2005


Travis,

This is intended to restore the context of your response.  TO is Travis 
Oliphant; CW is Colin Williams.

TO:

    The dtype parameter is still used and it is still called the same
    thing.   It's just that what constitutes a data-type has changed
    significantly.
    For example now tuples and dictionaries can be used to describe a
    data-type.  These definitions are recursive so that whenever
    data-type is used it means anything that can be interpreted as a
    data-type.  And I really mean data-descriptor, but data-type is in
    such common usage that I still use it.

CW:

    This would appear to be a good step forward but with all of the
    immutable types (int8, FloatType, TupleType, etc.) the data is
    stored in the ArrayType instance (array_data?) whereas, with a
    dictionary, it would appear to be necessary to store the items
    outside the array.  Is that desirable?

TO:

    I don't even understand what you are questioning here.   Perhaps you
    misunderstood me.  Items are always stored in the array.  Even
    records "items" are stored in the array.   The descriptor just
    allows you to describe what kind of data and how it is stored in the
    array.
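
The point that items, including record items, always live in the array's own buffer can be illustrated with a minimal sketch in present-day NumPy. The record layout and the names `point`, `x`, `y` and `id` are hypothetical, chosen only for illustration; they do not come from the thread.

```python
import numpy as np

# Hypothetical record layout (not from the thread): two floats and an int.
point = np.dtype([('x', np.float64), ('y', np.float64), ('id', np.int32)])

arr = np.zeros(3, dtype=point)
arr['id'] = [10, 20, 30]

# The descriptor only says how each item is laid out; the items themselves
# occupy the array's buffer, one fixed-size record after another.
record_bytes = point.itemsize          # bytes per record
total_bytes = arr.nbytes               # bytes held by the whole array
field_offsets = {name: point.fields[name][1] for name in point.names}

print(record_bytes, total_bytes, field_offsets)
```

The descriptor carries no data of its own: `arr.nbytes` is simply the number of records times `point.itemsize`.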

Response:

    Yes, I had assumed that int8 and FloatType are equally
    data-descriptors, i.e. objects which describe the elements of a
    numeric array.  I wrongly assumed that TupleType and DictType are
    also data-descriptors.

    Mea culpa. 

    On another subject, it would help those of us who are not
    aficionados of C if you provided the equivalent Python def
    statement for those functions implemented in C, especially the
    ArrayType's __new__ method, perhaps in the docstrings.

Colin W.

Travis Oliphant wrote:

> Colin J. Williams wrote:
>
>>
>> This would appear to be a good step forward but with all of the 
>> immutable types (int8, FloatType, TupleType, etc.) the data is stored 
>> in the ArrayType instance (array_data?) whereas, with a dictionary, 
>> it would appear to be necessary to store the items outside the 
>> array.  Is that desirable?
>
>
> I don't even understand what you are questioning here.   Perhaps you 
> misunderstood me.  Items are always stored in the array.  Even records 
> "items" are stored in the array.   The descriptor just allows you to 
> describe what kind of data and how it is stored in the array.
>
>>
>> Even the tuple can have its content modified, as the example below 
>> shows:
>>
> I don't understand what you mean to show by the 
> tuple-content-modifying example.
>
>>> dtype=(int32, (5,5))   ---  a 5x5 array of int32 is the description 
>>> of this item.
>>> dtype=(str, 10) --- a length-10 string
>>
>>
>>
>> So dtype now contains both the data type of each element and the 
>> shape of the array?  This seems a significant change from numarray or 
>> Numeric.
>
>
> No,  no.   Standard usage is the same.   In normal usage you would not 
> create an array this way.  You could, of course, but it's not the 
> documented procedure.    The reason for this  descriptor is to allow 
> you to have a field-element that itself is an array of items. 
>
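
A field element that is itself an array of items can be sketched as follows with present-day NumPy. The field names `label` and `grid` are hypothetical, and `'S10'` is used here as the byte-string counterpart of the length-10 string descriptor mentioned above.

```python
import numpy as np

# A descriptor for one item that is itself a 5x5 block of int32.
block = np.dtype((np.int32, (5, 5)))

# Hypothetical record with a length-10 byte string and a 5x5 int32 sub-array.
rec = np.dtype([('label', 'S10'), ('grid', np.int32, (5, 5))])

r = np.zeros(2, dtype=rec)
r['grid'][0] = np.arange(25).reshape(5, 5)

# Indexing the field exposes the sub-array shape as trailing dimensions.
grid_shape = r['grid'].shape

print(block.shape, block.base, rec.itemsize, grid_shape)
```

The array `r` itself is still one-dimensional; the `(5, 5)` shape belongs to the field, not to the array.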
>>
>>
>>> dtype=(int16, {'real':(int8,0),'imag':(int8,4)})  --- a descriptor 
>>> that acts like an int16 array mathematically (in ufuncs) but has 
>>> real and imag fields.
>>>
>>>
>> This adds complexity, is there a compensating benefit?  Do all of the 
>> complex operations apply?
>
>
> I'm only showing what is possible and that the notion of data-type 
> descriptor is complete.
>
>>
>> Why not clean things up by dropping typechar?  This seemed to be one 
>> of the warts in numarray, only carried forward for compatibility 
>> reasons.  Could the compatibility objectives of the project not be 
>> achieved, outside the ArrayType object, with a wrapper of some sort?
>
>
> Too hard to do at this point.  Too much code uses the characters.   I 
> also don't mind the characters so much (the struct module and the 
> Python array module use them).
>
>>
>> Couldn't the use of records avoid the cumbersome use of keys?
>
>
> Yes.  But, this is how you specify fields generally.
>
>>
>>> format2 (and how it's stored internally)
>>>
>>> {key1 : (data-type1, offset1 [, title1]),
>>>  key2 : (data-type2, offset2 [, title2]),
>>>   ...
>>>  keyn : (data-typen, offsetn [, titlen])
>>> }
>>>
>> This is cleaner, but couldn't this information be contained within the 
>> Record instance?
>
>
>
> Yes, but I am describing a general concept of data-type description 
> not just something applicable to records.
>
>>> Thus, you can do something like this:
>>>
>>> >>> a = ones((4,3), dtype=(int16, {'real':(int8, 0), 'imag':(int8, 1)}))
>>> >>> a['imag'] = 2
>>> >>> a['real'] = 1
>>> >>> a.tostring()
>>> '\x01\x02\x01\x02\x01\x02\x01\x02\x01\x02\x01\x02\x01\x02\x01\x02\x01\x02\x01\x02\x01\x02\x01\x02' 
>>>
>>>
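
The byte layout in the example above can be reproduced with present-day NumPy. This is a sketch under stated assumptions rather than the code from the thread: it uses a plain structured dtype with explicit offsets instead of the `(int16, {...})` form, `zeros` instead of `ones`, and `tobytes()` in place of the older `tostring()`. Reinterpreting the same memory as int16 is done with an explicit `view`.

```python
import sys
import numpy as np

# Two int8 fields at byte offsets 0 and 1, i.e. 2 bytes per item.
small_complex = np.dtype({'names': ['real', 'imag'],
                          'formats': [np.int8, np.int8],
                          'offsets': [0, 1]})

a = np.zeros((4, 3), dtype=small_complex)
a['imag'] = 2
a['real'] = 1

raw = a.tobytes()              # 12 items, each stored as b'\x01\x02'

# Assumption: an explicit view stands in for the "acts like int16" behaviour.
as_int16 = a.view(np.int16)
expected = int.from_bytes(b'\x01\x02', sys.byteorder)

print(raw)
print(as_int16[0, 0], expected)
```

The integer seen through the int16 view depends on the machine's byte order, which is why the sketch computes `expected` from `sys.byteorder` instead of hard-coding a value.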
>> Or, one could have something like:
>> class SmallComplex(Record):
>> ..''' This class typically has no instances in user code. '''
>> ..real= (int8, )
>> ..imag= (int8)
>> ..def __init__(self):
>> ....
>> ..def __new__(self):
>> ....
>>
>> >>> a = ones((4,3), dtype= SmallComplex)
>> >>> a.imag = 2
>> >>> a.real = 1
>> >>> a.tostring()
>> '\x01\x02\x01\x02\x01\x02\x01\x02\x01\x02\x01\x02\x01\x02\x01\x02\x01\x02\x01\x02\x01\x02\x01\x02' 
>>
>
>
> Yes, something like this should be possible, though I have not fleshed 
> out the user-interface at all.  I've just been working on the basic 
> structure that would support such things.  Do:
>
> class small_complex(record):
>         dtypedescr = {'r':(int8,0),'i':(int8,1)}
>
> a = ndrecarray((4,3), formats=small_complex)
> a.r = 1
> a.i = 2
> a.tostring()
>
> and your example would work (with current SVN...)
>
> The ndrecarray subclass allows the attribute-to-field conversion, 
> regular arrays do not.
>
> -Travis
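
The attribute-to-field conversion described above can be sketched with present-day NumPy, where the record-array class is `np.recarray`. The `ndrecarray` and `dtypedescr` names in the message belong to the 2005 development code and are not used here; the field names `r` and `i` follow the message.

```python
import numpy as np

# Same two-field layout as in the message: 'r' at offset 0, 'i' at offset 1.
small_complex = np.dtype([('r', np.int8), ('i', np.int8)])

plain = np.zeros((4, 3), dtype=small_complex)

# Viewing as a recarray adds attribute access to the fields
# while sharing the same memory as the plain array.
rec = plain.view(np.recarray)
rec.r = 1
rec.i = 2

# A regular array only offers key-based field access.
plain_has_attr = hasattr(plain, 'r')

print(plain.tobytes())
print(plain_has_attr)
```

Because `rec` is a view, the assignments through `rec.r` and `rec.i` are visible in `plain` as well.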
