
On Friday 24 January 2003 21:15, Todd Miller wrote:
My current thinking is something like:
recarrDescr = {
    "name"        : defineType(CharType, 16, ""),          # 16-character String
    "TDCcount"    : defineType(UInt8, 1, 0),               # unsigned byte
    "ADCcount"    : defineType(Int16, 1, 0),               # signed short integer
    "grid_i"      : defineType(Int32, 1, 9),               # integer
    "grid_j"      : defineType(Int32, 1, 9),               # integer
    "pressure"    : defineType(Float32, 1, 1.),            # float (single-precision)
    "temperature" : defineType(Float64, 32, arange(32)),   # double[32]
    "idnumber"    : defineType(Int64, 1, 0),               # signed long long
}
where defineType is a class that accepts (type, shape, default) parameters. It can be extended safely in the future if more needs appear.
You're way ahead of me here. The only thing I don't like about this is the additional relative complexity because of the addition of field names and default values. It would be nice to layer this more.
Well, I think a map between field names and values is valuable from the user's point of view. It helps to label the different pieces of information in the recarray. Moreover, if __getattr__ and __setattr__ methods (or __getitem__ and __setitem__) were implemented on recarray (as they are in my recarray2 version, for example), the field name would become a very convenient way to access a specific field (this introduces the limitation that a field name must be a valid Python identifier, but I think this is not a big restriction). By looking at the description dictionary, the user can get a quick idea of what can be found in every field (with no need for counting, which can be a big advantage, especially for long records).

With regard to default values, you can make this parameter (and even the shape) a keyword parameter in order to make it optional. That way, the definition can be as simple as "defineType(CharType)" (or even just "CharType", if you add a bit of code) or as complete as "defineType(CharType, shape, default, whatever_you_want)". I think this is quite a flexible approach.
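To make the two ideas above concrete, here is a minimal sketch, not the actual recarray or recarray2 code. It assumes a defineType class whose shape and default are optional keyword parameters, and a toy Record class (a hypothetical name) that exposes fields through __getattr__ and __setattr__. The type names are plain strings standing in for the real type objects.

```python
class defineType:
    """Hypothetical field descriptor; shape and default are optional."""

    def __init__(self, type, shape=1, default=None):
        self.type = type
        self.shape = shape
        self.default = default


class Record:
    """Toy single record with attribute access; a stand-in, not recarray."""

    def __init__(self, descr):
        # Store internal state through object.__setattr__ so that our own
        # __setattr__ (which only accepts declared field names) is bypassed.
        object.__setattr__(self, "_descr", descr)
        object.__setattr__(
            self, "_values", {name: d.default for name, d in descr.items()}
        )

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails, i.e. for field names.
        try:
            return self._values[name]
        except KeyError:
            raise AttributeError(name)

    def __setattr__(self, name, value):
        if name not in self._descr:
            raise AttributeError("no such field: %s" % name)
        self._values[name] = value


# String type names are placeholders for CharType, Int32, Float32.
descr = {
    "name": defineType("CharType", 16, ""),
    "grid_i": defineType("Int32", default=9),   # shape left at its default
    "pressure": defineType("Float32"),          # shortest possible form
}
rec = Record(descr)
rec.grid_i = 3
```

Assigning to a name that is not in the description raises AttributeError, which is one way the dictionary can double as documentation and as a guard.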
One more thing I don't understand looking at this: a dictionary is unordered.
Yeah, but this can be regarded as an advantage rather than a drawback, in the sense that you (the developer) can choose the order you prefer. For example, I was first using an alphanumerical order to arrange the data fields, but now I'm considering that an arrangement that optimizes the alignment of the fields could be far better. For instance, say that you have an (Int8, Int32, Float64) record; in principle it should be easy to create a routine that rearranges this record in the form (Float64, Int32, Int8), which optimizes access to the different fields (it may even be possible to introduce automatic padding later on, if recarrays support it in the future).

Maybe you are getting confused in thinking that recarrDescr will create the recarray. Not at all: this is a *metadata* definition that can be passed to the actual recarray function for recarray creation. Its role would be similar to that of the formats parameter (with typical values like "3a,4i,3w") in recarray.array, but with more verbosity and all the advantages mentioned above.
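A rough sketch of such a reordering routine follows. It is only an illustration under stated assumptions: the ITEMSIZE table, the function names, and the rule "each field starts at a multiple of its own size" are all hypothetical, and type names are plain strings rather than real type objects.

```python
# Assumed item sizes in bytes for a few type names (illustrative only).
ITEMSIZE = {"Int8": 1, "Int16": 2, "Int32": 4, "Int64": 8,
            "Float32": 4, "Float64": 8}


def order_by_alignment(descr):
    """Return (name, type) pairs, largest item first, ties broken by name."""
    return sorted(descr.items(), key=lambda item: (-ITEMSIZE[item[1]], item[0]))


def layout(fields):
    """Place each field at a multiple of its own size.

    Returns (offsets, end), where end is the offset just past the last field.
    """
    offsets = {}
    offset = 0
    for name, typ in fields:
        size = ITEMSIZE[typ]
        offset = (offset + size - 1) // size * size   # round up to alignment
        offsets[name] = offset
        offset += size
    return offsets, offset


descr = {"a": "Int8", "b": "Int32", "c": "Float64"}

declared = [("a", "Int8"), ("b", "Int32"), ("c", "Float64")]
reordered = order_by_alignment(descr)

declared_offsets, declared_end = layout(declared)      # needs 3 pad bytes
reordered_offsets, reordered_end = layout(reordered)   # needs none
```

With these assumptions the declared order ends at byte 16 (three pad bytes after the Int8), while the reordered one ends at byte 13 with no internal padding.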
instead of
((Int16, 3), (Int32, 4), (Float64, 20), )
This is pretty much exactly what I was thinking. It is straightforward to imagine and difficult to forget.
the former being more handy in lots of situations.
Would you please name some of these so we can explore handling them both ways?
Well, I'm afraid that the biggest advantage would come when dealing with recarrays in C extension modules. In that kind of situation it would be far better to deal with a "3a4i3w" string than with a tuple of Python objects. But maybe I'm wrong and the latter is not so complicated to manage; however, I used to work a lot with records (even before meeting recarray) and I was quite comfortable with formats in string form. Or perhaps it would be enough to provide a method for converting from the standard metadata layout (dictionary, tuple, or whatever) to a string format. This should not be very difficult.
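As a sketch of how small such a conversion method could be, the function below turns a type-repetition tuple into a compact format string. The character codes used here are those of Python's struct module, chosen only because they are well defined; they are not the recarray codes, and the CODES table and to_format_string name are hypothetical.

```python
import struct

# Assumed mapping from type names to struct-module character codes.
CODES = {"Int8": "b", "Int16": "h", "Int32": "i", "Int64": "q",
         "Float32": "f", "Float64": "d"}


def to_format_string(layout):
    """Convert ((type, repetition), ...) into a string such as '3h4i20d'."""
    return "".join("%d%s" % (count, CODES[typ]) for typ, count in layout)


layout = (("Int16", 3), ("Int32", 4), ("Float64", 20))
fmt = to_format_string(layout)

# "=" selects standard sizes with no alignment padding, so the size is
# simply 3*2 + 4*4 + 20*8 bytes.
size = struct.calcsize("=" + fmt)
```

The resulting string is easy to hand to C code, while the tuple stays the canonical form on the Python side.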
Well, if charcodes finally stay in, this has an additional advantage in that the Python crew has provided meaningful ways to express padding (character "x"), endianness ("=", "<", ">") and alignment ("@").
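For reference, this is how those characters behave in Python's standard struct module; whether recarray would adopt exactly the same semantics is an open question, so take this only as an illustration of the existing convention.

```python
import struct

# "=" : native byte order, standard sizes, no alignment padding.
packed_size = struct.calcsize("=bi")       # 1 byte + 4 bytes

# "x" : one explicit pad byte each; here three of them align the int by hand.
padded_size = struct.calcsize("=bxxxi")    # 1 + 3 pad + 4 bytes

# "@" : native byte order, sizes AND alignment; the result depends on the
# platform's C compiler, so no particular value is assumed here.
native_size = struct.calcsize("@bi")

# "<" and ">" : force little-endian and big-endian byte order.
little = struct.pack("<h", 1)
big = struct.pack(">h", 1)
```

The same 16-bit value 1 is laid out low byte first with "<" and high byte first with ">", independently of the machine running the code.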
We might also add these to the type-repetition tuple.
It would be nice, of course.

-- 
Francesc Alted