
Francesc Alted wrote:
On Friday 24 January 2003 21:15, Todd Miller wrote:
My current thinking is something like:
recarrDescr = {
    "name"        : defineType(CharType, 16, ""),           # 16-character String
    "TDCcount"    : defineType(UInt8, 1, 0),                # unsigned byte
    "ADCcount"    : defineType(Int16, 1, 0),                # signed short integer
    "grid_i"      : defineType(Int32, 1, 9),                # integer
    "grid_j"      : defineType(Int32, 1, 9),                # integer
    "pressure"    : defineType(Float32, 1, 1.),             # float (single-precision)
    "temperature" : defineType(Float64, 32, arange(32)),    # double[32]
    "idnumber"    : defineType(Int64, 1, 0),                # signed long long
}
Still think I'd prefer something separable:
recarrStruct = (
    (CharType, 16), UInt8, Int16, Int32, Int32, Float32, (Float64, 32), Int64
)
recarrFields = [
    "name", "TDCcount", "ADCcount", "grid_i", "grid_j",
    "pressure", "temperature", "idnumber"
]

I guess it might not be quite as good for large structs.
where defineType is a class that accepts (type, shape, default) parameters. It can be extended safely in the future if more needs appear.
You're way ahead of me here. The only thing I don't like about this is the additional relative complexity because of the addition of field names and default values. It would be nice to layer this more.
Well, I think a map between field names and values is valuable from the user's point of view. It may help him to label the different pieces of information in the recarray. Moreover, if __getattr__ and __setattr__ methods (or __getitem__ and __setitem__) were implemented on recarray (as they are in my recarray2 version, for example), the field name would become a very convenient way to access a specific field (this introduces the limitation that a field name must be a valid Python identifier, but I think this is not a big restriction). By looking at the description dictionary, the user can get a quick idea of what he can find in every field (with no need for counting, which can be a big advantage, especially for long records).
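To make the access-by-name idea concrete, here is a minimal sketch. It is not the recarray2 code; the `NamedRecords` class and its plain-list columns are assumptions made up for illustration. It shows both access styles and the identifier restriction mentioned above.

```python
class NamedRecords:
    """Toy container giving access to columns by field name (hypothetical)."""

    def __init__(self, names, columns):
        for name in names:
            # Attribute access only works for valid Python identifiers.
            if not name.isidentifier():
                raise ValueError("field name %r is not a valid identifier" % name)
        self._fields = dict(zip(names, columns))

    def __getitem__(self, name):
        # Dictionary-style access: rec["pressure"]
        return self._fields[name]

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails: rec.pressure
        fields = self.__dict__.get("_fields", {})
        if name in fields:
            return fields[name]
        raise AttributeError(name)


rec = NamedRecords(["grid_i", "pressure"], [[1, 2], [0.5, 1.5]])
print(rec.pressure)    # attribute-style access
print(rec["grid_i"])   # item-style access
```

Item-style access would not strictly need the identifier check; it is applied to all names here only so that both styles behave the same.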
That's true and sounds nice. I'm just thinking records with named fields should be derived from records with positional fields. If the functionality is layered, you can use as much complexity as you need. It's a good sign that both you and I thought of an identical tuple format; it's the obvious minimal one.
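One possible reading of the layering suggested here, as a rough sketch: the positional tuple stays the primary description, and names are attached in a second step. The string type tags and the helpers `normalize` and `name_fields` are hypothetical stand-ins, not numarray API.

```python
# Stand-in type tags; the thread uses numarray type objects instead.
CharType, UInt8, Int16, Int32, Float32, Float64, Int64 = (
    "CharType", "UInt8", "Int16", "Int32", "Float32", "Float64", "Int64")

recarrStruct = ((CharType, 16), UInt8, Int16, Int32, Int32,
                Float32, (Float64, 32), Int64)
recarrFields = ["name", "TDCcount", "ADCcount", "grid_i", "grid_j",
                "pressure", "temperature", "idnumber"]


def normalize(entry):
    """Return (type, repeat) for a bare type or a (type, repeat) pair."""
    if isinstance(entry, tuple):
        return entry
    return (entry, 1)


def name_fields(struct, names):
    """Layer names over a positional description; field order is preserved."""
    if len(struct) != len(names):
        raise ValueError("need exactly one name per field")
    return [(name,) + normalize(entry) for name, entry in zip(names, struct)]


named = name_fields(recarrStruct, recarrFields)
for field in named:
    print(field)
```

Because the result is a list rather than a dictionary, the physical order of the fields survives the naming step.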
With regard to default values, you can make this parameter (even the shape) a keyword parameter in order to make it optional.
OK. That's a good point.
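A sketch of what optional shape and default parameters could look like. The `defineType` body below is an assumption (the thread only gives its call signature), and the type is passed as a plain string to keep the example self-contained.

```python
class defineType:
    """Hypothetical field descriptor taking (type, shape, default)."""

    def __init__(self, type, shape=1, default=None):
        self.type = type
        self.shape = shape        # repetition count; 1 means a scalar field
        self.default = default    # None means "no default given"

    def __repr__(self):
        return "defineType(%r, shape=%r, default=%r)" % (
            self.type, self.shape, self.default)


# Positional form, as in the original proposal.
a = defineType("Int32", 1, 9)
# Keyword forms: shape and default can each be left out.
b = defineType("Float64", shape=32)
c = defineType("UInt8", default=0)
print(a, b, c)
```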
One more thing I don't understand looking at this: a dictionary is unordered.
Yeah, but this can be regarded as an advantage rather than a drawback, in the sense that you (the developer) can choose the order you prefer. For example, I was first using an alphanumerical order to arrange the data fields, but now I'm considering that an arrangement that optimizes the alignment of the fields could be far better. For instance, say that you have a (Int8, Int32, Float64) record; in principle it should be easy to write a routine that rearranges this record as (Float64, Int32, Int8), which optimizes access to the different fields (it may even be possible to introduce automatic padding later on, if recarrays support it in the future).
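The reordering routine described above could be as small as a sort by item size, largest first. This is only a sketch: the `ITEMSIZE` table and `order_for_alignment` are hypothetical names, and it assumes each type is naturally aligned to its own size.

```python
# Item sizes in bytes for the types in the example.
ITEMSIZE = {"Int8": 1, "Int32": 4, "Float64": 8}


def order_for_alignment(types):
    """Sort field types by decreasing item size (stable for equal sizes)."""
    return tuple(sorted(types, key=lambda t: ITEMSIZE[t], reverse=True))


reordered = order_for_alignment(("Int8", "Int32", "Float64"))
print(reordered)  # ('Float64', 'Int32', 'Int8')
```

With the largest items first, every field starts at an offset that is a multiple of its own size, so no padding is needed between fields.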
Maybe you are getting confused
Yes and no. :)
in thinking that recarrDescr will create the recarray. Not at all; this is a *metadata* definition that can be passed to the actual recarray function for recarray creation.
Just like the type repetition tuple except also including field names and default values. I don't think you lost me. For what we do, the exact physical layout of the "struct" is important, so order matters. I see order as part of the meta-data, but I don't usually deal with meta-entities so maybe I've got that part wrong. :)
Its function would be similar to the formats parameter (with typical values like "3a,4i,3w") in recarray.array, but with more verbosity and all the reported advantages.
instead of
((Int16, 3), (Int32, 4), (Float64, 20), )
This is pretty much exactly what I was thinking. It is straightforward to imagine and difficult to forget.
the former being more handy in lots of situations.
Would you please name some of these so we can explore handling them both ways?
Well, I'm afraid that the biggest advantage would come when dealing with recarrays in C extension modules. In that kind of situation it would be far better to deal with a "3a4i3w" string than with a tuple of Python objects. But maybe I'm wrong and the latter is not so complicated to manage; however, I used to work a lot with records (even before meeting recarray) and I was quite comfortable with formats in string mode.
I was thinking that if the above was an issue, we could write an API function(s) to "compile" the type-repetition tuple into arrays of ints which describe the type of each field and corresponding repetition factor.
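A sketch of such a "compile" step, done in Python for illustration. The integer type codes in `TYPE_CODES` and the function name `compile_struct` are made up for this example; a real API would use whatever codes the C side defines.

```python
from array import array

# Hypothetical integer codes for each type.
TYPE_CODES = {"Int16": 0, "Int32": 1, "Float64": 2}


def compile_struct(struct):
    """Turn a type-repetition tuple into two parallel arrays of ints."""
    types = array("i")
    repeats = array("i")
    for entry in struct:
        # A bare type means a repetition factor of 1.
        type_name, count = entry if isinstance(entry, tuple) else (entry, 1)
        types.append(TYPE_CODES[type_name])
        repeats.append(count)
    return types, repeats


types, repeats = compile_struct((("Int16", 3), ("Int32", 4), ("Float64", 20)))
print(types.tolist(), repeats.tolist())  # [0, 1, 2] [3, 4, 20]
```

Two flat int arrays like these are easy to hand to C code, which never has to unpack Python tuples.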
Or perhaps it would be enough to provide a method for converting from the standard metadata layout (dictionary or tuple or whatever) to a string format. This should not be very difficult.
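A minimal sketch of that conversion. Since the exact recarray format letters are not spelled out in this thread, the example borrows the format characters of Python's standard `struct` module as a stand-in; `STRUCT_CODES` and `to_format_string` are hypothetical names.

```python
import struct

# Stand-in mapping onto `struct` module format characters,
# not the recarray format letters.
STRUCT_CODES = {"Int16": "h", "Int32": "i", "Float64": "d"}


def to_format_string(struct_desc):
    """Convert a type-repetition tuple into a compact format string."""
    parts = []
    for entry in struct_desc:
        # A bare type means a repetition factor of 1.
        type_name, count = entry if isinstance(entry, tuple) else (entry, 1)
        parts.append("%d%s" % (count, STRUCT_CODES[type_name]))
    return "".join(parts)


fmt = to_format_string((("Int16", 3), ("Int32", 4), ("Float64", 20)))
print(fmt)  # 3h4i20d
# "=" selects standard sizes with no padding: 3*2 + 4*4 + 20*8 bytes.
size = struct.calcsize("=" + fmt)
print(size)  # 182
```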
Almost exactly what I suggested above.

See you Monday,
Todd