Mailman 3 Re: Scipy Core (Was Numeric 3.0) - NumPy-Discussion

11 Dec 2005

      Travis,

This is intended to restore the context of your response.  TO is Travis 
Oliphant; CW is Colin Williams.

TO:

    The dtype parameter is still used and it is still called the same
    thing.   It's just that what constitutes a data-type has changed
    significantly.
    For example now tuples and dictionaries can be used to describe a
    data-type.  These definitions are recursive so that whenever
    data-type is used it means anything that can be interpreted as a
    data-type.  And I really mean data-descriptor, but data-type is in
    such common usage that I still use it.

CW:

    This would appear to be a good step forward but with all of the
    immutable types (int8, FloatType, TupleType, etc.) the data is
    stored in the ArrayType instance (array_data?) whereas, with a
    dictionary, it would appear to be necessary to store the items
    outside the array.  Is that desirable?

TO:

    I don't even understand what you are questioning here.   Perhaps you
    misunderstood me.  Items are always stored in the array.  Even
    records "items" are stored in the array.   The descriptor just
    allows you to describe what kind of data and how it is stored in the
    array.
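A short sketch of the storage point, written against present-day NumPy rather than the 2005 scipy_core API. The field names `x` and `y` are made up for illustration. It suggests that the descriptor only records names, types and offsets, while every item lives in the array's own buffer.

```python
import numpy as np

# Hypothetical record layout: one int8 followed by one float32 (packed, no padding).
point = np.dtype([('x', np.int8), ('y', np.float32)])

a = np.zeros(3, dtype=point)
a['x'] = [1, 2, 3]
a['y'] = 0.5

# The descriptor only describes the layout of a single item.
item_size = point.itemsize
offsets = {name: point.fields[name][1] for name in point.names}

# All three records sit back to back in the array's buffer;
# a field lookup is a view onto that same memory, not a copy stored elsewhere.
raw = a.tobytes()
shares = np.shares_memory(a['x'], a)
print(item_size, offsets, len(raw), shares)
```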

Response:

    Yes, I had assumed that int8 and FloatType are equally
    data-descriptors, i.e. objects which describe the elements of a
    numeric array.  Wrongly, I assumed that TupleType and DictType were
    also data-descriptors.

    Mea culpa. 

    On another subject, it would help, for those of us who are not
    aficionados of C, if you provided the equivalent Python def
    statement for those functions implemented in C, especially the
    ArrayType's __new__ method, perhaps in the docstrings.

Colin W.

Travis Oliphant wrote:
...
Colin J. Williams wrote:
...
This would appear to be a good step forward but with all of the 
immutable types (int8, FloatType, TupleType, etc.) the data is stored 
in the ArrayType instance (array_data?) whereas, with a dictionary, 
it would appear to be necessary to store the items outside the 
array.  Is that desirable?
I don't even understand what you are questioning here.   Perhaps you 
misunderstood me.  Items are always stored in the array.  Even records 
"items" are stored in the array.   The descriptor just allows you to 
describe what kind of data and how it is stored in the array.
...
Even the tuple can have its content modified, as the example below 
shows:
I don't understand what you mean to show by the 
tuple-content-modifying example.
...
...
dtype=(int32, (5,5))   ---  a 5x5 array of int32 is the description 
of this item.
dtype=(str, 10) --- a length-10 string
So dtype now contains both the data type of each element and the 
shape of the array?  This seems a significant change from numarray or 
Numeric.
No,  no.   Standard usage is the same.   In normal usage you would not 
create an array this way.  You could, of course, but it's not the 
documented procedure.    The reason for this  descriptor is to allow 
you to have a field-element that itself is an array of items.
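To illustrate the field-element case, here is a minimal sketch in present-day NumPy syntax (an assumption, since the thread predates it). The field names `id` and `block` are hypothetical.

```python
import numpy as np

# A record whose 'block' field is itself a 5x5 array of int32.
rec = np.dtype([('id', np.int32), ('block', np.int32, (5, 5))])

a = np.zeros(3, dtype=rec)
block_shape = a['block'].shape   # array shape followed by the field's own shape

# Used directly as an array's dtype, the sub-array shape is folded into the
# array's shape, which matches the remark that this is not the normal way
# to create an array.
b = np.zeros(2, dtype=np.dtype((np.int32, (5, 5))))
print(a.shape, block_shape, b.shape, b.dtype)
```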
...
...
dtype=(int16, {'real':(int8,0),'imag':(int8,4)})  --- a descriptor 
that acts like an int16 array mathematically (in ufuncs) but has real 
and imag fields.
This adds complexity, is there a compensating benefit?  Do all of the 
complex operations apply?
I'm only showing what is possible and that the notion of data-type 
descriptor is complete.
...
Why not clean things up by dropping typechar?  Typechars seemed to be 
one of the warts in numarray, only carried forward for compatibility 
reasons.  Could the compatibility objectives of the project not be 
achieved, outside the ArrayType object, with a wrapper of some sort?
Too hard to do at this point.  Too much code uses the characters.   I 
also don't mind the characters so much (the struct module and the 
Python array module use them).
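For readers unfamiliar with the characters mentioned here, a small standard-library sketch shows the same code, `'h'` for a signed short, in both modules. It does not touch the array package under discussion.

```python
import array
import struct

# 'h' means a signed short in both the array and struct modules.
shorts = array.array('h', [1, 2, 3])

# '<' requests little-endian byte order with standard sizes, so 'h' is 2 bytes.
packed = struct.pack('<hh', 1, 2)
short_size = struct.calcsize('<h')

print(shorts.typecode, short_size, packed)
```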
...
Couldn't the use of records avoid the cumbersome use of keys?
Yes.  But, this is how you specify fields generally.
...
...
format2 (and how it's stored internally)
{key1 : (data-type1, offset1 [, title1]),
 key2 : (data-type2, offset2 [, title2]),
  ...
 keyn : (data-typen, offsetn [, titlen])
}
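As a rough illustration of this dictionary format, present-day NumPy still accepts a mapping from field name to (data-type, offset[, title]). The sketch below assumes that behaviour; the title string is invented for the example.

```python
import numpy as np

# Keys are field names; values are (data-type, offset[, title]) tuples.
small = np.dtype({
    'real': (np.int8, 0),
    'imag': (np.int8, 1, 'imaginary part'),   # hypothetical title
})

real_type, real_offset = small.fields['real'][:2]
imag_type, imag_offset = small.fields['imag'][:2]
print(small.names, small.itemsize, real_offset, imag_offset)
```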
This is cleaner, but couldn't this information be contained within the 
Record instance?
Yes, but I am describing a general concept of data-type description 
not just something applicable to records.
...
...
Thus, you can do something like this:
...
...
...
a = ones((4,3), dtype=(int16, {'real':(int8, 0), 'imag':(int8, 1)}))
a['imag'] = 2
a['real'] = 1
a.tostring()
'\x01\x02\x01\x02\x01\x02\x01\x02\x01\x02\x01\x02\x01\x02\x01\x02\x01\x02\x01\x02\x01\x02\x01\x02'
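A sketch of the same byte layout in present-day NumPy. It deliberately swaps in a different technique: instead of the (base type, fields) tuple shown above, it views a plain int16 array through a two-field record type of the same item size. It also uses `tobytes()`, since `tostring()` was later deprecated.

```python
import numpy as np

# Two int8 fields at byte offsets 0 and 1, overlaying one 2-byte item.
pair = np.dtype({'names': ['real', 'imag'],
                 'formats': [np.int8, np.int8],
                 'offsets': [0, 1]})

a = np.ones((4, 3), dtype=np.int16)   # behaves as int16 in arithmetic
fields = a.view(pair)                 # same buffer, seen as (real, imag) pairs

fields['imag'] = 2
fields['real'] = 1

raw = a.tobytes()                     # 12 items, 2 bytes each
print(fields.shape, len(raw))
```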
Or, one could have something like:
class SmallComplex(Record):
    ''' This class typically has no instances in user code. '''
    real= (int8, )
    imag= (int8)
    def __init__(self):
        ...
    def __new__(self):
        ...
...
...
...
a = ones((4,3), dtype= SmallComplex)
a.imag = 2
a.real = 1
a.tostring()
'\x01\x02\x01\x02\x01\x02\x01\x02\x01\x02\x01\x02\x01\x02\x01\x02\x01\x02\x01\x02\x01\x02\x01\x02'
Yes, something like this should be possible, though I have not fleshed 
out the user-interface at all.  I've just been working on the basic 
structure that would support such things.  Do:
class small_complex(record):
        dtypedescr = {'r':(int8,0),'i':(int8,1)}
a = ndrecarray((4,3), formats=small_complex)
a.r = 1
a.i = 2
a.tostring()
and your example would work (with current SVN...)
The ndrecarray subclass allows the attribute-to-field conversion, 
regular arrays do not.
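The attribute-to-field distinction can be sketched with present-day NumPy, where `np.recarray` plays the role described here for ndrecarray. This is an assumption about the modern equivalent, not the SVN code referred to above.

```python
import numpy as np

small_complex = np.dtype([('r', np.int8), ('i', np.int8)])

# The record-array subclass maps attribute access onto fields.
a = np.recarray((4, 3), dtype=small_complex)
a.r = 1
a.i = 2
raw = a.tobytes()

# A regular array with the same dtype only offers key access.
plain = np.zeros((4, 3), dtype=small_complex)
plain['r'] = 1
has_attr = hasattr(plain, 'r')
print(len(raw), has_attr)
```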
-Travis
