[Numpy-discussion] Re: Bytes Object and Metadata
Travis Oliphant
oliphant at ee.byu.edu
Wed Mar 30 11:49:03 EST 2005
>
> After more thought, I think using the struct-like type characters is
> not a good idea for the array protocol. I think that the character
> codes used by the numarray record array (kind_character + byte_width)
> are better. Commas can separate heterogeneous data. The problem is
> that if the data buffer originally came from a different machine or
> was saved with a different compiler (e.g. an mmap'ed file), then the
> struct-like typecodes only tell you the C type that machine thought
> the data was. They do not tell you how to interpret the data on this
> machine.
> So, I think we should use the __array_typestr__ method to pass type
> information using the kind_character + byte_width method. I'm also
> going to use this type information for pickles, so that arrays pickled
> on one machine type will be able to be interpreted on another with ease.
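To illustrate the portability concern in the quoted text, here is a minimal sketch that maps a native struct-style code to a kind_character + byte_width string. The helper name `typestr_for_native` and its lookup table are hypothetical, not part of any proposed protocol; the point is only that the width of a code such as "l" depends on the machine, whereas the resulting string states the width explicitly.

```python
import struct

# Hypothetical mapping from a few native struct codes to kind characters.
_KIND_FOR_CODE = {
    "b": "i", "h": "i", "i": "i", "l": "i", "q": "i",   # signed integers
    "B": "u", "H": "u", "I": "u", "L": "u", "Q": "u",   # unsigned integers
    "f": "f", "d": "f",                                 # floats
}

def typestr_for_native(code):
    """Return kind_character + byte_width for a native struct code.

    struct.calcsize() without a byte-order prefix reports the size the
    local C compiler uses, so the result for "l" differs across platforms.
    """
    return "%s%d" % (_KIND_FOR_CODE[code], struct.calcsize(code))

# A C long is 4 bytes on some platforms and 8 on others, so this prints
# either "i4" or "i8" depending on where it runs.
print(typestr_for_native("l"))
```

A consumer that receives only "l" has to guess which width the producer meant; a consumer that receives the explicit width does not.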
>
> Bool             -- "b%d" % sizeof(bool)
> Signed Integer   -- "i%d" % sizeof(<some int>)
> Unsigned Integer -- "u%d" % sizeof(<some uint>)
> Float            -- "f%d" % sizeof(<some float>)
> Complex          -- "c%d" % sizeof(<some complex>)
> Object           -- "O%d" % sizeof(PyObject *)
>                     --- this would only be useful on shared memory
> String           -- "S%d" % itemsize
> Unicode          -- "U%d" % itemsize
> Void             -- "V%d" % itemsize
Of course with this protocol for the typestr, the array_itemsize is
redundant and can disappear. Another reason to like it.
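As a rough sketch of why the item size becomes redundant, the following parses a comma-separated typestr built from the kind characters listed above and sums the byte widths. The function names `parse_typestr` and `itemsize` are made up for illustration, and the exact syntax (for example, whether whitespace is allowed around commas) is an assumption.

```python
import re

# One field: a kind character from the list above followed by a byte width.
_FIELD = re.compile(r"([biufcOSUV])(\d+)")

def parse_typestr(typestr):
    """Split a typestr such as "i4,f8,S10" into (kind, byte_width) pairs."""
    fields = []
    for part in typestr.split(","):
        match = _FIELD.fullmatch(part.strip())
        if match is None:
            raise ValueError("bad typestr field: %r" % part)
        fields.append((match.group(1), int(match.group(2))))
    return fields

def itemsize(typestr):
    """Total bytes per item, recovered from the typestr alone."""
    return sum(width for _kind, width in parse_typestr(typestr))

print(parse_typestr("i4,f8,S10"))  # [('i', 4), ('f', 8), ('S', 10)]
print(itemsize("i4,f8,S10"))       # 22
```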
> I also think that rather than attach < or > to the start of the string
> it would be easier to have another protocol for endianness. Perhaps
> something like:
> __array_endian__ (optional Python integer with the value 1 in it).
> If it is not 1, then a byteswap is necessary.
I have mixed feelings about this; I could be persuaded either way.
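One possible reading of the endianness idea, sketched below, is that the producer writes the integer 1 in the byte order of its data and the consumer reads it back in native order; any value other than 1 then signals that a byteswap is needed. This interpretation, the 4-byte marker width, and the helper names are all assumptions and are not spelled out in the quoted proposal.

```python
import sys

def endian_marker(byteorder):
    """Encode the integer 1 as 4 bytes in the given order ('little' or 'big')."""
    return (1).to_bytes(4, byteorder)

def needs_byteswap(marker):
    """Decode the marker in native order; anything but 1 means swap."""
    return int.from_bytes(marker, sys.byteorder) != 1

def byteswap(buf, width):
    """Reverse the bytes inside each width-byte item of buf."""
    if len(buf) % width:
        raise ValueError("buffer length is not a multiple of the item width")
    return b"".join(buf[i:i + width][::-1] for i in range(0, len(buf), width))

# Data produced on a machine with the opposite byte order to this one.
other = "big" if sys.byteorder == "little" else "little"
data = (258).to_bytes(4, other)
if needs_byteswap(endian_marker(other)):
    data = byteswap(data, 4)
print(int.from_bytes(data, sys.byteorder))  # 258
```

The alternative discussed in the quoted text, a "<" or ">" prefix on the typestr, carries the same information in the string itself rather than in a separate attribute.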
-Travis