FW: [Numpy-discussion] typecodes in numarray

Francesc Alted falted at openlc.org
Sat Jan 25 04:43:02 EST 2003


A Divendres 24 Gener 2003 21:15, Todd Miller va escriure:
> >
> >My current thinking is something like:
> >
> >recarrDescr = {
> >    "name"        : defineType(CharType, 16, ""),  # 16-character String
> >    "TDCcount"    : defineType(UInt8, 1, 0),    # unsigned byte
> >    "ADCcount"    : defineType(Int16, 1, 0),    # signed short integer
> >    "grid_i"      : defineType(Int32, 1, 9),    # integer
> >    "grid_j"      : defineType(Int32, 1, 9),    # integer
> >    "pressure"    : defineType(Float32, 1, 1.),  # float 
> > (single-precision) "temperature" : defineType(Float64, 32, arange(32)), 
> > # double[32] "idnumber"    : defineType(Int64, 1, 0),    # signed long
> > long }
> >
> >where defineType is a class that accepts (type, shape, default)
> > parameters. It can be extended safely in the future if more needs appear.
>
> You're way ahead of me here.  The only thing I don't like about this is
> the additional relative complexity because of the addition of field
> names and default values.   It would be nice to layer this more.
>

Well, I think a map between field names and values is valuable from the
user's point of view. It may help him to label the different information on
the recarray. Moreover, if __getattr__ and __setattr__ methods (or
__getitem__ and __setitem__) would get implemented on recarray (as they are
in my recarray2 version, for example), the field name can become a very
convenient manner to access a specific field by name (this introduce the
limitation that field name must be a valid python identifier, but I think
this is not a big restriction). By looking at the description dictionary,
the user can have a quick idea of what he can find in every field (with no
need of counting, which can be a big advantage specially for long records).

With regard to default values, you can make this parameter (even the shape)
a keyword parameter in order to make it optional. In that way, the
definition can be as simple as "defineType(CharType)" (or even just
"Chartype", if you add a bit of code) or as complete as
"defineType(Chartype, shape, default, whatever_you_want)". I think this is
a quite flexible approach.

>One more thing I don't understand looking at this:  a dictionary is 
>unordered.

Yeah, but this can be regarded as an advantage rather than a drawback in the
sense that you can choose the order you (the developer) prefer. For example,
I was using first a alphanumerical order to arrange the data fields, but
now, I'm considering that a arrangement that optimizes the alignment of the
fields could be far better. As for one, say that you have a (Int8, Int32,
Float64) record; in principle it could be easy to create a routine that
arranges this record in the form (Float64,Int32, Int8) that optimizes the
different field access (it may be even possible to introduce automatic
padding later on if recarrays would support them in the future).

Maybe you are getting confused in thinking that recarrDescr will create the
recarray. Not at all, this a *metadata* definition that can be passed to the
actual recarray funtion for recarray creation. Its function would be
similar to the formats parameter (with typical values like "3a,4i,3w") in
recarray.array, but with more verbosity and all the reported advantages.

> >instead of
> >
> >((Int16, 3),
> > (Int32, 4),
> > (Float64, 20),
> > )
>
> This is pretty much exactly what I was thinking.  It is straightforward
> to imagine and difficult to forget.
>
> >the former being more handy in lots of situations.
>
> Would you please name some of these so we can explore handling them both
> ways?
>

Well, I'm afraid that the best advantage would be when dealing with
recarrays in C extension modules. In this kind of situation it would be far
better to deal with a "3a4i3w" array than a tuple of python objects. But
maybe I'm wrong and the latter is not so-complicated to manage; however, I
used to work a lot with records (even before meeting recarray) and I was
quite comfortable with formats in string mode.

Or perhaps it would be enough to provide a method for converting from the
standard metadata layout (dictionary or tuple or whatever), to a string
format. This should be not very difficult.

> >
> >Well, if charcodes finally stay in, this have an additional advantage in
> >that python crew has provided meaningful ways to express padding
> > (character "x"), endianess ("=", "<", ">") and alignment ("@").
>
> We might also add these to the type-repetition tuple.

It would be nice, of course.

-- 
Francesc Alted




More information about the NumPy-Discussion mailing list