
Hi,
Since Perry said not too long ago that the numarray crew would ask for suggestions for RecArray improvements, I'm going to suggest a couple.
I find the .tolist() method quite inconvenient when applied to RecArray objects as it is now:
r[2:4]
array( [(3, 33.0, 'c'), (4, 44.0, 'd')], formats=['1UInt8', '1Float32', '1a1'], shape=2, names=['c1', 'c2', 'c3'])
r[2:4].tolist()
[<numarray.records.Record instance at 0x406a946c>, <numarray.records.Record instance at 0x406a912c>]
The suggested behaviour would be:
r[2:4].tolist()
[(3, 33.0, 'c'), (4, 44.0, 'd')]
Another thing is that an element of a RecArray should be returned as a tuple instead of as a records.Record object, as happens now:
r[2]
<numarray.records.Record instance at 0x4064074c>
The suggested behaviour would be:
r[2]
(3, 33.0, 'c')
I think the latter would be consistent with the convention that a __getitem__(int) of a NumArray object returns a python type instead of a rank-0 array. In the same way, a __getitem__(int) of a RecArray should return a python type (a tuple in this case).
Below is the code that I use right now to simulate this behaviour, but it would be nice if the code were included in the numarray.records module.
from numarray import records

def tolist(arr):
    """Converts a RecArray or Record to a list of rows"""
    outlist = []
    if isinstance(arr, records.Record):
        for i in range(arr.array._nfields):
            outlist.append(arr.array.field(i)[arr.row])
        outlist = tuple(outlist)  # return a tuple for records
    elif isinstance(arr, records.RecArray):
        for j in range(arr.nelements()):
            tmplist = []
            for i in range(arr._nfields):
                tmplist.append(arr.field(i)[j])
            outlist.append(tuple(tmplist))
    return outlist
Cheers,
-- Francesc Alted
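For readers without numarray at hand, the proposed behaviour can be sketched in plain Python. MiniRecArray below is a hypothetical stand-in, not part of numarray.records, and the first two data rows are invented for illustration. It only shows integer indexing returning a tuple and tolist() returning a list of tuples.

```python
# Hypothetical stand-in for a record array; NOT the numarray.records API.
class MiniRecArray:
    def __init__(self, rows, names):
        self.names = list(names)
        # Store the data column by column, as a record array would.
        self.columns = [list(col) for col in zip(*rows)]

    def __len__(self):
        return len(self.columns[0]) if self.columns else 0

    def __getitem__(self, key):
        if isinstance(key, slice):
            # A slice returns another record array.
            rows = list(zip(*(col[key] for col in self.columns)))
            return MiniRecArray(rows, self.names)
        # An integer index returns a plain tuple, as suggested above.
        return tuple(col[key] for col in self.columns)

    def tolist(self):
        # A list of row tuples instead of a list of Record objects.
        return [self[i] for i in range(len(self))]


# Rows 0 and 1 are made up; rows 2 and 3 match the example in the message.
r = MiniRecArray(
    [(1, 11.0, 'a'), (2, 22.0, 'b'), (3, 33.0, 'c'), (4, 44.0, 'd')],
    names=['c1', 'c2', 'c3'],
)
print(r[2])              # (3, 33.0, 'c')
print(r[2:4].tolist())   # [(3, 33.0, 'c'), (4, 44.0, 'd')]
```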

Francesc Alted wrote:
> Since Perry said not too long ago that the numarray crew would ask for suggestions for RecArray improvements, I'm going to suggest a couple.
> I find the .tolist() method quite inconvenient when applied to RecArray objects as it is now:
> r[2:4]
> array( [(3, 33.0, 'c'), (4, 44.0, 'd')], formats=['1UInt8', '1Float32', '1a1'], shape=2, names=['c1', 'c2', 'c3'])
> r[2:4].tolist()
> [<numarray.records.Record instance at 0x406a946c>, <numarray.records.Record instance at 0x406a912c>]
> The suggested behaviour would be:
> r[2:4].tolist()
> [(3, 33.0, 'c'), (4, 44.0, 'd')]
> Another thing is that an element of a RecArray should be returned as a tuple instead of as a records.Record object, as happens now:
> r[2]
> <numarray.records.Record instance at 0x4064074c>
> The suggested behaviour would be:
> r[2]
> (3, 33.0, 'c')
> I think the latter would be consistent with the convention that a __getitem__(int) of a NumArray object returns a python type instead of a rank-0 array. In the same way, a __getitem__(int) of a RecArray should return a python type (a tuple in this case).
These are good examples of where improvements are needed (we are also looking at how best to handle multidimensional arrays and should have a proposal this week).
What I'm wondering about is what a single element of a record array should be. Returning a tuple has an undeniable simplicity to it. On the other hand, we've been using recarrays that allow naming the various columns (which we refer to as "fields"). If one can refer to fields of a recarray, shouldn't one be able to refer to a field (by name) of one of its elements? Or are you proposing that basic recarrays not have that sort of capability (something added by a subclass)?
Perry

At 5:14 PM -0400 2004-07-12, Perry Greenfield wrote:
> What I'm wondering about is what a single element of a record array should be. Returning a tuple has an undeniable simplicity to it. On the other hand, we've been using recarrays that allow naming the various columns (which we refer to as "fields"). If one can refer to fields of a recarray, shouldn't one be able to refer to a field (by name) of one of its elements? Or are you proposing that basic recarrays not have that sort of capability (something added by a subclass)?
In my opinion, a single item of a record array should be a RecordItem object that is a dictionary that keeps items in field order. Thus:
- use the standard dictionary interface to deal with values by name (except that the keys are always in the correct order).
- one can also get and set all the data at once as a tuple. This is NOT a standard dictionary interface, but is essential. Functions such as getvalues(), setvalues(dataTuple) should do it.
Adopting the full dictionary interface means one gets a standard, mature and fairly complete set of features. ALSO a RecordItem object can then be used wherever a dictionary object is needed.
I suspect it's also useful to have named field access: RecordItem.fieldname but am a bit reluctant to suggest so many different ways of getting to the data.
I assume it will continue to be easy to get all data for a field by naming the appropriate field. That's a really nice feature. It would be even better if a masked array could be used, but I have no idea how hard this would be.
Which brings up a side issue: any hope of integrating masked arrays into numarray, such that they could be used wherever a numarray array could be used? Areas where I particularly find myself needing them include nd_image filtering and writing C extensions.
-- Russell
P.S. I submitted several feature requests and bug reports for records on sourceforge months ago. I hope they'll not be overlooked during the review process.
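A minimal sketch of the RecordItem idea, assuming a modern Python where a plain dict already preserves insertion order. The class is hypothetical and not an existing numarray type; only the names getvalues() and setvalues() come from the message above.

```python
# Hypothetical sketch of the "RecordItem" idea; not an existing numarray class.
class RecordItem(dict):
    """A dict whose keys are the field names, kept in field order."""

    def __init__(self, names, values):
        if len(names) != len(values):
            raise ValueError("need exactly one value per field")
        # Since Python 3.7 a dict preserves insertion order, so the keys
        # stay in field order.
        super().__init__(zip(names, values))

    def getvalues(self):
        """Return all the data at once as a tuple, in field order."""
        return tuple(self.values())

    def setvalues(self, data):
        """Replace all the data at once from a tuple, in field order."""
        if len(data) != len(self):
            raise ValueError("need exactly one value per field")
        for name, value in zip(list(self), data):
            self[name] = value


item = RecordItem(['c1', 'c2', 'c3'], (3, 33.0, 'c'))
item['c2'] = 34.5                 # standard dictionary access by name
print(list(item.keys()))          # ['c1', 'c2', 'c3']
print(item.getvalues())           # (3, 34.5, 'c')
item.setvalues((4, 44.0, 'd'))
print(item.getvalues())           # (4, 44.0, 'd')
```

This sketch does not stop a caller from adding or deleting keys; a real implementation would probably want to guard against that.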

On Monday 12 July 2004 23:14, Perry Greenfield wrote:
> What I'm wondering about is what a single element of a record array should be. Returning a tuple has an undeniable simplicity to it.
Yeah, this is why I'm strongly biased toward this possibility.
> On the other hand, we've been using recarrays that allow naming the various columns (which we refer to as "fields"). If one can refer to fields of a recarray, shouldn't one be able to refer to a field (by name) of one of its elements? Or are you proposing that basic recarrays not have that sort of capability (something added by a subclass)?
Well, I'm not sure about that. But just in case most people would like to access records by field as well as by index, I would advocate that Record instances behave as similarly as possible to a tuple (or dictionary?). That includes creating appropriate __str__() *and* __repr__() methods as well as a __getitem__() that supports both field names and indices. I'm not sure whether providing a __getattr__() method would be OK, but for the sake of simplicity and in order to have (preferably) only one way to do things, I would say no.
Regards,
-- Francesc Alted

On Tuesday 13 July 2004 10:28, Francesc Alted wrote:
> On Monday 12 July 2004 23:14, Perry Greenfield wrote:
> > What I'm wondering about is what a single element of a record array should be. Returning a tuple has an undeniable simplicity to it.
> Yeah, this is why I'm strongly biased toward this possibility.
> > On the other hand, we've been using recarrays that allow naming the various columns (which we refer to as "fields"). If one can refer to fields of a recarray, shouldn't one be able to refer to a field (by name) of one of its elements? Or are you proposing that basic recarrays not have that sort of capability (something added by a subclass)?
> Well, I'm not sure about that. But just in case most people would like to access records by field as well as by index, I would advocate that Record instances behave as similarly as possible to a tuple (or dictionary?). That includes creating appropriate __str__() *and* __repr__() methods as well as a __getitem__() that supports both field names and indices. I'm not sure whether providing a __getattr__() method would be OK, but for the sake of simplicity and in order to have (preferably) only one way to do things, I would say no.
I've been thinking that one way to reconcile returning a tuple for a single element of a RecArray with still being able to retrieve a field by name is to play with RecArray.__getitem__ and let it support key names in addition to indices. This is better seen with an example. Right now, one can say:
r=records.array([(1,"asds", 24.),(2,"pwdw", 48.)], "1i4,1a4,1f8")
r._fields["c1"]
array([1, 2])
r._fields["c1"][1]
2
What I propose is to be able to say:
r["c1"]
array([1, 2])
r["c1"][1]
2
Which would replace the notation:
r[1]["c1"]
2
which was recently suggested. I.e. the suggestion is to realize RecArrays as a collection of columns, as well as a collection of rows.
-- Francesc Alted
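The proposal can be sketched in plain Python as a __getitem__ that dispatches on the key type. ColumnAwareRecArray is a hypothetical class written for this illustration, with plain lists standing in for numarray columns; it is not how numarray.records is implemented.

```python
# Hypothetical column-aware container; NOT the numarray.records API.
class ColumnAwareRecArray:
    def __init__(self, rows, names):
        self._names = list(names)
        # One Python list per column, keyed by field name.
        self._fields = {name: list(col)
                        for name, col in zip(self._names, zip(*rows))}

    def __getitem__(self, key):
        if isinstance(key, str):
            # A field name selects a whole column: r["c1"]
            return self._fields[key]
        # An integer selects a row, returned as a plain tuple: r[1]
        return tuple(self._fields[name][key] for name in self._names)


r = ColumnAwareRecArray([(1, "asds", 24.0), (2, "pwdw", 48.0)],
                        names=["c1", "c2", "c3"])
print(r["c1"])      # [1, 2]
print(r["c1"][1])   # 2
print(r[1])         # (2, 'pwdw', 48.0)
```

With this dispatch, a single row can stay a simple tuple while named access goes through the array itself, which is the trade-off the message argues for.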

Francesc Alted wrote:
> On Tuesday 13 July 2004 10:28, Francesc Alted wrote:
> > On Monday 12 July 2004 23:14, Perry Greenfield wrote:
> > > What I'm wondering about is what a single element of a record array should be. Returning a tuple has an undeniable simplicity to it.
> > Yeah, this is why I'm strongly biased toward this possibility.
> > > On the other hand, we've been using recarrays that allow naming the various columns (which we refer to as "fields"). If one can refer to fields of a recarray, shouldn't one be able to refer to a field (by name) of one of its elements? Or are you proposing that basic recarrays not have that sort of capability (something added by a subclass)?
> > Well, I'm not sure about that. But just in case most people would like to access records by field as well as by index, I would advocate that Record instances behave as similarly as possible to a tuple (or dictionary?). That includes creating appropriate __str__() *and* __repr__() methods as well as a __getitem__() that supports both field names and indices. I'm not sure whether providing a __getattr__() method would be OK, but for the sake of simplicity and in order to have (preferably) only one way to do things, I would say no.
> I've been thinking that one way to reconcile returning a tuple for a single element of a RecArray with still being able to retrieve a field by name is to play with RecArray.__getitem__ and let it support key names in addition to indices. This is better seen with an example:
> Right now, one can say:
> r=records.array([(1,"asds", 24.),(2,"pwdw", 48.)], "1i4,1a4,1f8")
> r._fields["c1"]
> array([1, 2])
> r._fields["c1"][1]
> 2
> What I propose is to be able to say:
> r["c1"]
> array([1, 2])
> r["c1"][1]
I would suggest going a step beyond this, so that one can have r.c1[1]; see the script below. I have not explored the assignment of a value to r.c1[1], but it seems to be achievable. If changes along this line are acceptable, it is suggested that fields be renamed cols, or some such, to indicate their wider impact.
Colin W.
> 2
> Which would replace the notation:
> r[1]["c1"]
> 2
> which was recently suggested.
> I.e. the suggestion is to realize RecArrays as a collection of columns, as well as a collection of rows.
# tRecord.py to explore RecArray
import numarray.records as _rec
import sys
#
class Rec1(_rec.RecArray):
    def __new__(cls, buffer, formats, shape=0, names=None, byteoffset=0,
                bytestride=None, byteorder=sys.byteorder, aligned=0):
        # This calls RecArray.__init__ - reason unclear.
        # Why can't the instance be fully created by RecArray.__init__?
        return _rec.RecArray.__new__(cls, buffer, formats=formats, shape=shape,
                                     names=names, byteorder=byteorder,
                                     aligned=aligned)

    def __init__(self, buffer, formats, shape=0, names=None, byteoffset=0,
                 bytestride=None, byteorder=sys.byteorder, aligned=0):
        arr= _rec.array(buffer, formats=formats, shape=shape, names=names,
                        byteorder=byteorder, aligned=aligned)
        self.__setstate__(arr.__getstate__())

    def __getattr__(self, name):
        # We reach here if the attribute does not belong to the basic Rec1 set
        return self._fields[name]

    def __getattribute__(self, name):
        return _rec.RecArray.__getattribute__(self, name)

    def __repr__(self):
        return self.__class__.__name__ + _rec.RecArray.__repr__(self)[8:]

    def __setattr__(self, name, value):
        return _rec.RecArray.__setattr__(self, name, value)

    def __str__(self):
        return self.__class__.__name__ + _rec.RecArray.__str__(self)[8:]

if __name__ == '__main__':
    # Francesc Alted 13-Jul-04 05:06
    r= _rec.array([(1,"asds", 24.),(2,"pwdw", 48.)], "1i4,1a4,1f8")
    print r._fields["c1"]
    print r._fields["c1"][1]
    r1= Rec1([(1,"asds", 24.),(2,"pwdw", 48.)], "1i4,1a4,1f8")
    print r1._fields["c1"]
    print r1._fields["c1"][1]
    # r1.zz= 99 # acceptable
    print r1.c1
    print r1.c1[1]
    try:
        x= r1.ugh
    except:
        print 'ugh not recognized as an attribute'
    ''' The above delivers:
    [1 2]
    2
    [1 2]
    2
    [1 2]
    2
    ugh not recognized as an attribute
    '''

On Thursday 15 July 2004 17:21, Colin J. Williams wrote:
> > What I propose is to be able to say:
> > r["c1"][1]
> I would suggest going a step beyond this, so that one can have r.c1[1]; see the script below.
Yeah. I've implemented something similar to access column elements for pytables Table objects. However, the problem in this case is that there are already attributes that "pollute" the column namespace, so that a column named "size" collides with the size() method.
I came up with a solution by adding a new "cols" attribute to the Table object that is an instance of a simple class named Cols with no attributes that can pollute the namespace (except some starting with "__" or "_v_"). Then, it is just a matter of providing functionality to access the different columns. In that case, when a reference to a column is made, another object (an instance of the Column class) is returned. This Column object is basically an accessor to column values with __getitem__() and __setitem__() methods. That might sound complicated, but it is not. I'm attaching part of the relevant code below.
I personally like that solution in the context of pytables because it extends the "natural naming" convention quite naturally. A similar approach could be applied to RecArray objects as well, although numarray might (and probably does) have other usage conventions.
> I have not explored the assignment of a value to r.c1[1], but it seems to be achievable.
In the scheme I've just proposed, the following should be feasible:
value = r.cols.c1[1]
r.cols.c1[1] = value
-- Francesc Alted
-----------------------------------------------------------------
# Excerpt: Index and processRange are defined elsewhere in pytables
# and are not shown here.
import types

class Cols(object):
    """This is a container for columns in a table

    It provides methods to get Column objects that gives access to the
    data in the column.

    Like with Group instances and AttributeSet instances, the natural
    naming is used, i.e. you can access the columns on a table like if
    they were normal Cols attributes.

    Instance variables:
        _v_table -- The parent table instance
        _v_colnames -- List with all column names

    Methods:
        __getitem__(colname)
    """

    def __init__(self, table):
        """Create the container to keep the column information.

        table -- The parent table
        """
        self.__dict__["_v_table"] = table
        self.__dict__["_v_colnames"] = table.colnames
        # Put the column in the local dictionary
        for name in table.colnames:
            self.__dict__[name] = Column(table, name)

    def __len__(self):
        return self._v_table.nrows

    def __getitem__(self, name):
        """Get the column named "name" as an item."""
        if not isinstance(name, types.StringType):
            raise TypeError, \
"Only strings are allowed as keys of a Cols instance. You passed object: %s" % name
        # If the column does not exist, raise an AttributeError
        if not name in self._v_colnames:
            raise AttributeError, \
"Column name '%s' does not exist in table:\n'%s'" % (name, str(self._v_table))
        return self.__dict__[name]

    def __str__(self):
        """The string representation for this object."""
        # The pathname
        pathname = self._v_table._v_pathname
        # Get this class name
        classname = self.__class__.__name__
        # The number of columns
        ncols = len(self._v_colnames)
        return "%s.cols (%s), %s columns" % (pathname, classname, ncols)

    def __repr__(self):
        """A detailed string representation for this object."""
        out = str(self) + "\n"
        for name in self._v_colnames:
            # Get this class name
            classname = getattr(self, name).__class__.__name__
            # The shape for this column
            shape = self._v_table.colshapes[name]
            # The type
            tcol = self._v_table.coltypes[name]
            if shape == 1:
                shape = (1,)
            out += " %s (%s%s, %s)" % (name, classname, shape, tcol) + "\n"
        return out

class Column(object):
    """This is an accessor for the actual data in a table column

    Instance variables:
        table -- The parent table instance
        name -- The name of the associated column

    Methods:
        __getitem__(key)
    """

    def __init__(self, table, name):
        """Create the container to keep the column information.

        table -- The parent table instance
        name -- The name of the column that is associated with this object
        """
        self.table = table
        self.name = name
        # Check whether an index exists or not
        iname = "_i_"+table.name+"_"+name
        self.index = None
        if iname in table._v_parent._v_indices:
            self.index = Index(where=self, name=iname,
                               expectedrows=table._v_expectedrows)
        else:
            self.index = None

    def __getitem__(self, key):
        """Returns a column element or slice

        It takes different actions depending on the type of the "key"
        parameter:

        If "key" is an integer, the corresponding element in the column is
        returned as a NumArray/CharArray, or a scalar object, depending on
        its shape. If "key" is a slice, the row slice determined by this
        slice is returned as a NumArray or CharArray object (whatever is
        appropriate).
        """
        if isinstance(key, types.IntType):
            if key < 0:
                # To support negative values
                key += self.table.nrows
            (start, stop, step) = processRange(self.table.nrows, key, key+1, 1)
            return self.table._read(start, stop, step, self.name, None)[0]
        elif isinstance(key, types.SliceType):
            (start, stop, step) = processRange(self.table.nrows, key.start,
                                               key.stop, key.step)
            return self.table._read(start, stop, step, self.name, None)
        else:
            raise TypeError, "'%s' key type is not valid in this context" % \
                  (key)

    def __str__(self):
        """The string representation for this object."""
        # The pathname
        pathname = self.table._v_pathname
        # Get this class name
        classname = self.__class__.__name__
        # The shape for this column
        shape = self.table.colshapes[self.name]
        if shape == 1:
            shape = (1,)
        # The type
        tcol = self.table.coltypes[self.name]
        return "%s.cols.%s (%s%s, %s)" % (pathname, self.name, classname,
                                          shape, tcol)

    def __repr__(self):
        """A detailed string representation for this object."""
        return str(self)
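The excerpt above depends on pytables internals, so here is a much smaller, self-contained sketch of the same cols idea that also includes the __setitem__ side. MiniTable and its _data attribute are hypothetical and use plain lists as storage; this is not the pytables implementation.

```python
# Hypothetical, simplified analogue of the Cols/Column pair; it is not
# the pytables implementation and uses plain lists as column storage.
class Column:
    """Accessor for one column of the parent table."""

    def __init__(self, table, name):
        self.table = table
        self.name = name

    def __getitem__(self, key):
        return self.table._data[self.name][key]

    def __setitem__(self, key, value):
        self.table._data[self.name][key] = value


class Cols:
    """Container whose only public attributes are the column names."""

    def __init__(self, table):
        for name in table._data:
            # Writing to __dict__ directly mirrors the code above.
            self.__dict__[name] = Column(table, name)


class MiniTable:
    def __init__(self, **columns):
        self._data = {name: list(values) for name, values in columns.items()}
        self.cols = Cols(self)

    def size(self):
        # A method that a column called "size" would otherwise shadow.
        return len(next(iter(self._data.values()), []))


r = MiniTable(c1=[1, 2], size=[10, 20])
value = r.cols.c1[1]
r.cols.c1[1] = value + 40
print(r.cols.c1[:])     # [1, 42]
print(r.cols.size[0])   # 10, no clash with the size() method
print(r.size())         # 2
```

Because the columns live on a separate cols object, a column named "size" and the size() method can coexist.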

Francesc Alted wrote:
> On Thursday 15 July 2004 17:21, Colin J. Williams wrote:
> > > What I propose is to be able to say:
> > > r["c1"][1]
> > I would suggest going a step beyond this, so that one can have r.c1[1]; see the script below.
> Yeah. I've implemented something similar to access column elements for pytables Table objects. However, the problem in this case is that there are already attributes that "pollute" the column namespace, so that a column named "size" collides with the size() method.
The idea of mapping field names to attributes occurs to everyone quickly, but for the reasons Francesc gives (as well as another I'll mention) we were reluctant to implement it. The other reason is that it would be nice to allow field names that are not legal attributes (e.g., that include spaces or other illegal attribute characters). There are potentially people with data in databases or other similar formats who would like to map field names exactly. While one can certainly still use the attribute approach and not support all field names (or column, or col...), it does introduce another glitch in the user interface when it works only for a subset of legal names.
> I came up with a solution by adding a new "cols" attribute to the Table object that is an instance of a simple class named Cols with no attributes that can pollute the namespace (except some starting with "__" or "_v_"). Then, it is just a matter of providing functionality to access the different columns. In that case, when a reference to a column is made, another object (an instance of the Column class) is returned. This Column object is basically an accessor to column values with __getitem__() and __setitem__() methods. That might sound complicated, but it is not. I'm attaching part of the relevant code below.
> I personally like that solution in the context of pytables because it extends the "natural naming" convention quite naturally. A similar approach could be applied to RecArray objects as well, although numarray might (and probably does) have other usage conventions.
> > I have not explored the assignment of a value to r.c1[1], but it seems to be achievable.
> In the scheme I've just proposed, the following should be feasible:
> value = r.cols.c1[1]
> r.cols.c1[1] = value
This solution avoids name collisions but doesn't handle the other problem. This is worth considering, but I thought I'd hear comments about the other issue before deciding it (there is also the "more than one way" issue; but this guideline seems to bend quite often to pragmatic concerns).
We're still chewing on all the other issues and plan to start floating some proposals, rationales and questions before long.
Perry

On Thursday 15 July 2004 19:37, Perry Greenfield wrote:
> formats who would like to map field names exactly. While one can certainly still use the attribute approach and not support all field names (or column, or col...), it does introduce another glitch in the user interface when it works only for a subset of legal names.
Yep. I forgot that issue. My particular workaround for that was to provide an optional trMap dictionary at Table (in our case, RecArray) creation time that maps the original names that are not valid python names to valid ones. That would read something like:
r=records.array([(1,"as")], "1i4,1a2", names=["c 1", "c2"], trMap={"c1": "c 1"})
that would indicate that the "c 1" column, which is not a valid python name (it has a space in the middle), can be accessed by using the "c1" string, which is a valid python id. That way, r.cols.c1 would access column "c 1". And although I must admit that this solution is not very elegant, it makes it possible to cope with those situations where the column names are not valid python names.
-- Francesc Alted
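The translation-map idea can be sketched as follows. Only the name trMap and its direction (valid identifier to real column name) come from the message above; the Cols class here is a hypothetical, dictionary-backed illustration and does not show how pytables or numarray implement it.

```python
# Hypothetical sketch of the trMap idea; "trMap" is the name used in the
# message above, everything else here is invented for illustration.
class Cols:
    def __init__(self, columns, trMap=None):
        # columns: real column name -> list of values
        # trMap:   valid Python identifier -> real column name
        self.__dict__["_v_columns"] = columns
        self.__dict__["_v_trMap"] = dict(trMap or {})

    def __getattr__(self, attr):
        # Called only when normal attribute lookup fails.
        real_name = self._v_trMap.get(attr, attr)
        try:
            return self._v_columns[real_name]
        except KeyError:
            raise AttributeError(attr) from None


cols = Cols({"c 1": [1], "c2": ["as"]}, trMap={"c1": "c 1"})
print(cols.c1)   # [1]   (the column really named "c 1")
print(cols.c2)   # ['as']
```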

Perry Greenfield wrote:
> Francesc Alted wrote:
> > On Thursday 15 July 2004 17:21, Colin J. Williams wrote:
> > > > What I propose is to be able to say:
> > > > r["c1"][1]
> > > I would suggest going a step beyond this, so that one can have r.c1[1]; see the script below.
> > Yeah. I've implemented something similar to access column elements for pytables Table objects. However, the problem in this case is that there are already attributes that "pollute" the column namespace, so that a column named "size" collides with the size() method.
> The idea of mapping field names to attributes occurs to everyone quickly, but for the reasons Francesc gives (as well as another I'll mention) we were reluctant to implement it. The other reason is that it would be nice to allow field names that are not legal attributes (e.g., that include spaces or other illegal attribute characters). There are potentially people with data in databases or other similar formats who would like to map field names exactly. While one can certainly still use the attribute approach and not support all field names (or column, or col...), it does introduce another glitch in the user interface when it works only for a subset of legal names.
It would, I suggest, not be unduly restrictive to bar the existing attribute names but, if that's not acceptable, Francesc has suggested the .cols workaround, although I would prefer to avoid the added clutter. Incidentally, there is no current protection against wiping out an existing method:
[Dbg]>>> r1.size= 0
[Dbg]>>> r1.size
0
[Dbg]>>>
> > I came up with a solution by adding a new "cols" attribute to the Table object that is an instance of a simple class named Cols with no attributes that can pollute the namespace (except some starting with "__" or "_v_"). Then, it is just a matter of providing functionality to access the different columns. In that case, when a reference to a column is made, another object (an instance of the Column class) is returned. This Column object is basically an accessor to column values with __getitem__() and __setitem__() methods. That might sound complicated, but it is not. I'm attaching part of the relevant code below.
> > I personally like that solution in the context of pytables because it extends the "natural naming" convention quite naturally. A similar approach could be applied to RecArray objects as well, although numarray might (and probably does) have other usage conventions.
> > > I have not explored the assignment of a value to r.c1[1], but it seems to be achievable.
> > In the scheme I've just proposed, the following should be feasible:
> > value = r.cols.c1[1]
> > r.cols.c1[1] = value
> This solution avoids name collisions but doesn't handle the other problem. This is worth considering, but I thought I'd hear comments about the other issue before deciding it (there is also the "more than one way" issue; but this guideline seems to bend quite often to pragmatic concerns).
To allow for multi-word column names, assignment could replace a space by an underscore and, in retrieval, the reverse could be done, i.e. underscores would be banned in column names.
Colin W.
> We're still chewing on all the other issues and plan to start floating some proposals, rationales and questions before long.
> Perry

On Friday 16 July 2004 02:21, Colin J. Williams wrote:
> To allow for multi-word column names, assignment could replace a space by an underscore and, in retrieval, the reverse could be done, i.e. underscores would be banned in column names.
That's not so easy. What about other chars like '/&%@$()' that cannot be part of python names? Finding a biunivocal map between them and allowed chars would be difficult (if possible at all). Besides, the resulting colnames might become a real mess.
Regards,
-- Francesc Alted
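A small illustration of the objection, using made-up column names. The helper to_identifier is hypothetical and implements only the space-to-underscore rule; it shows that the rule is not invertible unless underscores are banned, and that other characters still fail to give valid Python identifiers.

```python
def to_identifier(colname):
    """Naive mapping: turn every space into an underscore."""
    return colname.replace(" ", "_")


# Made-up column names, chosen to exercise the problem cases.
names = ["flux density", "flux_density", "a/b", "rate (%)"]
mapped = [to_identifier(n) for n in names]

# Two different column names collapse onto the same attribute name ...
print(mapped[0] == mapped[1])                 # True
# ... and other characters still do not yield valid identifiers.
print([m.isidentifier() for m in mapped])     # [True, True, False, False]
```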

Francesc Alted wrote:
> On Friday 16 July 2004 02:21, Colin J. Williams wrote:
> > To allow for multi-word column names, assignment could replace a space by an underscore and, in retrieval, the reverse could be done, i.e. underscores would be banned in column names.
> That's not so easy. What about other chars like '/&%@$()' that cannot be part of python names? Finding a biunivocal map between them and allowed chars would be difficult (if possible at all). Besides, the resulting colnames might become a real mess.
> Regards,
Yes, if the objective is to include special characters or facilitate multi-lingual column names, and it probably should be, then my suggestion is quite inadequate.
Perhaps there could be a simple name -> column number mapping in place of _names. References to a column, or a field in a record, could then be through this dictionary. Basic access to data in a record would be by position number, rather than name, but the dictionary would facilitate access by name.
Data could be referenced either through the column name: r1.c2[1] or through the record r1[1].c2, with the possibility that the index is multi-dimensional in either case.
Colin W.
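The name -> column number mapping could look roughly like the sketch below. IndexedRecords, get() and column() are hypothetical names invented here; the point is only that positions are the basic access path and the dictionary translates arbitrary strings to positions.

```python
# Hypothetical sketch; the class and method names are invented here.
class IndexedRecords:
    def __init__(self, rows, names):
        self._rows = [list(row) for row in rows]
        # Any string can be a key, including names with spaces or symbols.
        self._colnum = {name: i for i, name in enumerate(names)}

    def get(self, row, col):
        """Read one value; col is a position or a column name."""
        if isinstance(col, str):
            col = self._colnum[col]       # name -> position
        return self._rows[row][col]

    def column(self, col):
        """Return one whole column as a list."""
        if isinstance(col, str):
            col = self._colnum[col]
        return [row[col] for row in self._rows]


r1 = IndexedRecords([(1, "as"), (2, "pw")], names=["c 1", "c2"])
print(r1.get(1, 1))         # pw  (by position)
print(r1.get(1, "c2"))      # pw  (by name)
print(r1.column("c 1"))     # [1, 2]
```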

> On Friday 16 July 2004 02:21, Colin J. Williams wrote:
> > To allow for multi-word column names, assignment could replace a space by an underscore and, in retrieval, the reverse could be done, i.e. underscores would be banned in column names.
> That's not so easy. What about other chars like '/&%@$()' that cannot be part of python names? Finding a biunivocal map between them and allowed chars would be difficult (if possible at all). Besides, the resulting colnames might become a real mess.
Personally, I think the idea of allowing access to fields via attributes is fatally flawed. The problems raised (non-obvious mapping between field names with special characters and allowable attribute names and also the collision with existing instance variable and method names) clearly show it would be forced and non-pythonic.
The obvious solution seems to be some combination of the dict interface (an ordered dict that keeps its keys in original field order) and the list interface. My personal leaning is:
- Offer most of the dict methods, including __get/setitem__, keys, values and all iterators but NOT setdefault, popitem or anything else that adds or deletes a field.
- Offer the list version of __get/setitem__, as well, but NONE of list's methods.
- Make the default iterator iterate over values, not keys (field names), i.e. have the item act like a list, not a dict, when used as an iterator.
In other words, the following all work (where item is one element of a numarray.record array):
item[0] = 10 # set value of field 0 to 10
x = item[0:5] # get value of fields 0 through 4
item[:] = list of replacement values
item["afield"] = 10
"%(afield)s" % item
the methods iterkeys, itervalues, iteritems, keys, values, has_key all work
the method update might work, but it's an error to add new fields
-- Russell
P.S. Folks are welcome to use my ordered dictionary implementation RO.Alg.OrderedDictionary, which is part of the RO package <http://www.astro.washington.edu/rowen/ROPython.html>. It is fully standalone (despite its location in my hierarchy) and is used in production code.
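The combined dict/list interface listed above can be sketched in modern Python as follows. This RecordItem is hypothetical and deliberately small: it covers name, integer and slice access, iteration over values, and refuses to add or delete fields. The Python 2 methods named in the message (iterkeys, has_key and so on) are left out.

```python
# Hypothetical sketch of the interface listed above; not numarray code.
class RecordItem:
    def __init__(self, names, values):
        self._names = list(names)
        self._values = list(values)

    def _index(self, name):
        try:
            return self._names.index(name)
        except ValueError:
            # Unknown field: fields can never be added through the item.
            raise KeyError(name) from None

    def __getitem__(self, key):
        if isinstance(key, str):
            return self._values[self._index(key)]
        return self._values[key]            # int or slice, list style

    def __setitem__(self, key, value):
        if isinstance(key, str):
            self._values[self._index(key)] = value
        elif isinstance(key, slice):
            new = list(self._values)
            new[key] = value
            if len(new) != len(self._names):
                raise ValueError("cannot add or delete fields")
            self._values = new
        else:
            self._values[key] = value

    def __len__(self):
        return len(self._names)

    def __iter__(self):
        return iter(self._values)           # iterate over values, not keys

    def keys(self):
        return list(self._names)

    def values(self):
        return list(self._values)

    def items(self):
        return list(zip(self._names, self._values))


item = RecordItem(["afield", "bfield", "cfield"], [1, 2.0, "x"])
item[0] = 10                      # set value of field 0
item["bfield"] = 20.0             # set a field by name
item[:] = [11, 21.0, "y"]         # replace all values at once
print(item[0:2])                  # [11, 21.0]
print("%(afield)s" % item)        # 11
print(list(item))                 # [11, 21.0, 'y']
```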

Russell E Owen wrote:
> > On Friday 16 July 2004 02:21, Colin J. Williams wrote:
> > > To allow for multi-word column names, assignment could replace a space by an underscore and, in retrieval, the reverse could be done, i.e. underscores would be banned in column names.
> > That's not so easy. What about other chars like '/&%@$()' that cannot be part of python names? Finding a biunivocal map between them and allowed chars would be difficult (if possible at all). Besides, the resulting colnames might become a real mess.
> Personally, I think the idea of allowing access to fields via attributes is fatally flawed. The problems raised (non-obvious mapping between field names with special characters and allowable attribute names and also the collision with existing instance variable and method names) clearly show it would be forced and non-pythonic.
+1
It also makes it difficult to do the following:
a = item[:10, ('age', 'surname', 'firstname')]
where field (or column) 1 is 'firstname', field 2 is 'surname', and field 10 is 'age'.
-- Paul
--
Paul Barrett, PhD      Space Telescope Science Institute
Phone: 410-338-4475    ESS/Science Software Branch
FAX:   410-338-4767    Baltimore, MD 21218

Paul Barrett wrote:
> Russell E Owen wrote:
> > > On Friday 16 July 2004 02:21, Colin J. Williams wrote:
> > > > To allow for multi-word column names, assignment could replace a space by an underscore and, in retrieval, the reverse could be done, i.e. underscores would be banned in column names.
> > > That's not so easy. What about other chars like '/&%@$()' that cannot be part of python names? Finding a biunivocal map between them and allowed chars would be difficult (if possible at all). Besides, the resulting colnames might become a real mess.
> > Personally, I think the idea of allowing access to fields via attributes is fatally flawed. The problems raised (non-obvious mapping between field names with special characters and allowable attribute names and also the collision with existing instance variable and method names) clearly show it would be forced and non-pythonic.
> +1
Paul,
Below, I've appended my response to Francesc's 08:36 message; it was copied to the list but does not appear in the archive.
> It also makes it difficult to do the following:
> a = item[:10, ('age', 'surname', 'firstname')]
> where field (or column) 1 is 'firstname', field 2 is 'surname', and field 10 is 'age'.
> -- Paul
Could you clarify what you have in mind here please? Is this a proposed extension to records.py, as it exists in version 1.0?
Colin W.
------------------------------------------------------------------------
Yes, if the objective is to include special characters or facilitate multi-lingual column names, and it probably should be, then my suggestion is quite inadequate.
Perhaps there could be a simple name -> column number mapping in place of _names. References to a column, or a field in a record, could then be through this dictionary. Basic access to data in a record would be by position number, rather than name, but the dictionary would facilitate access by name.
Data could be referenced either through the column name: r1.c2[1] or through the record r1[1].c2, with the possibility that the index is multi-dimensional in either case.
Colin W.
participants (5)
- Colin J. Williams
- Francesc Alted
- Paul Barrett
- Perry Greenfield
- Russell E Owen