Re: [Numpy-discussion] Proposed record array behavior: the rest of the story: updated

28 Jul 2004

On Wed, 28 Jul 2004 12:00:40 +0200, Francesc Alted <falted@pytables.org> wrote:
...
On Tuesday 27 July 2004 22:04, gerard.vermeulen@grenoble.cnrs.fr wrote:
...
Introducing recordArray["column"] as an alternative for
recordArray.field("column") breaks a symmetry between for instance 1-d
record arrays and 2-d normal arrays. (the symmetry is strongly suggested
by their representation: a record array prints almost as a list of tuples
and a 2-d normal array almost as a list of lists).
Indexing a column of a 2-d normal array is done by normalArray[:, column],
so why not recArray[:, "column"] ?
Well, I must admit that this has its beauty (by revealing the symmetry
that you mentioned). However, mixing integers and strings in indices can
be, in my opinion, rather confusing for most people. Also, I guess that
the implementation wouldn't be easy.
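
For illustration, the recArray[:, "column"] spelling could be sketched in plain Python roughly as follows. This is only a toy under stated assumptions: the RecArray class, its constructor, and the list-of-tuples storage are hypothetical and are not the numarray implementation.

```python
# Hypothetical sketch: a 1-d "record array" stored as a list of tuples,
# indexed as rec[:, "name"] by analogy with normalArray[:, column].

class RecArray:
    def __init__(self, names, rows):
        self.names = list(names)              # field names, in column order
        self.rows = [tuple(r) for r in rows]  # one tuple per record

    def __getitem__(self, index):
        if isinstance(index, tuple):          # rec[row_selector, "field"]
            row_sel, field = index
            col = self.names.index(field)     # position decides the meaning
            selected = self.rows[row_sel]
            if isinstance(row_sel, slice):
                return [row[col] for row in selected]
            return selected[col]
        return self.rows[index]               # rec[1] -> one record (a tuple)


rec = RecArray(["x", "y"], [(1, 2.0), (3, 4.0), (5, 6.0)])
print(rec[:, "y"])   # the whole "y" column
print(rec[1, "y"])   # the "y" field of record 1
```

In this sketch the second position of the index is always the field, so the string never competes with the row selector.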
...
I prefer to use
recordArray.column[32]
and/or
recordArray[32].column
rather than recordArray["column"][32].
I would prefer instead:
recordArray.fields.column[32]
or
recordArray.cols.column[32]
(note the use of the plural in fields and cols, which I think is more
consistent with their functionality)
The problem with:
recordArray[32].fields.column
is that I don't see it as natural and, besides, completion capabilities
would be broken after the [] brackets.
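
A minimal sketch of the recordArray.cols.column[32] spelling, assuming a hypothetical _Cols helper and a toy RecArray class (neither is numarray code). It also accepts cols["column"] and reports the field names through __dir__, which is what attribute completion typically relies on.

```python
class _Cols:
    """Hypothetical namespace object exposing each field as an attribute."""

    def __init__(self, names, rows):
        self._names = list(names)
        self._rows = rows

    def _column(self, name):
        col = self._names.index(name)         # ValueError if unknown field
        return [row[col] for row in self._rows]

    def __getattr__(self, name):
        # Only reached when normal attribute lookup fails.
        if name.startswith("_"):
            raise AttributeError(name)
        try:
            return self._column(name)
        except ValueError:
            raise AttributeError(name) from None

    def __getitem__(self, name):              # the cols["column"] spelling
        try:
            return self._column(name)
        except ValueError:
            raise KeyError(name) from None

    def __dir__(self):
        # Advertise field names so that completion tools can list them.
        return list(self._names)


class RecArray:
    def __init__(self, names, rows):
        self.rows = [tuple(r) for r in rows]
        self.cols = _Cols(names, self.rows)

    def __getitem__(self, i):
        return self.rows[i]


rec = RecArray(["x", "y"], [(1, 2.0), (3, 4.0)])
print(rec.cols.y[1])      # attribute spelling
print(rec.cols["y"][1])   # string-index spelling, same column
```

Keeping the fields behind .cols means that a field name can never shadow a method or attribute of the array itself.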
Two points:

1. This is true for vanilla Python but not for IPython-0.6.2:

packer@zombie:~> ipython
Python 2.3+ (#1, Jan  7 2004, 09:17:35)
Type "copyright", "credits" or "license" for more information.

IPython 0.6.2 -- An enhanced Interactive Python.
?       -> Introduction to IPython's features.
@magic  -> Information about IPython's 'magic' @ functions.
help    -> Python's own help system.
object? -> Details about 'object'. ?object also works, ?? prints more.

In [1]: d = {'Francesc': 0}

In [2]: d['Francesc'].__a
d['Francesc'].__abs__  d['Francesc'].__add__  d['Francesc'].__and__

In [2]: d['Francesc'].__a

   You see, the completion mechanism of ipython recognizes d['Francesc'] as an
   integer.

2. If one accepts that a "field_name" can be used as an attribute, one must be
   able to say:

   record.field_name ( == record.field("field_name") )

   and (since recordArray[32] returns a record) also:

   recordArray[32].field_name

   and not

   recordArray[32].cols.field_name (sorry, I abhor this)
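
Point 2 can be sketched as follows, assuming hypothetical Record and RecordArray classes; only the field("name") spelling comes from the discussion above, the rest is illustrative.

```python
class Record:
    """Hypothetical single record: field values addressable by name."""

    def __init__(self, names, values):
        self._names = tuple(names)
        self._values = tuple(values)

    def field(self, name):
        return self._values[self._names.index(name)]

    def __getattr__(self, name):
        # Only reached when normal attribute lookup fails.
        if name.startswith("_"):
            raise AttributeError(name)
        try:
            return self.field(name)           # record.y == record.field("y")
        except ValueError:
            raise AttributeError(name) from None


class RecordArray:
    def __init__(self, names, rows):
        self._names = tuple(names)
        self._rows = [tuple(r) for r in rows]

    def __getitem__(self, i):
        # Indexing with an integer returns a Record, so attribute access
        # works directly on the result: records[1].y
        return Record(self._names, self._rows[i])


records = RecordArray(["x", "y"], [(1, 2.0), (3, 4.0)])
print(records[1].y)
print(records[1].field("y"))
```

One caveat of this spelling: a field that happens to be called "field" would be hidden by the method of the same name and would remain reachable only through field("field").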
...
Anyway, as Russell suggested, I don't like recordArray["column"][32],
because it would be unnecessary (you can get same result using
recordArray[column_idx][32]).
Thank you for this little slip, you mean recordArray["column"][32] is
recordArray[32][column_idx], isn't it?
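
The equivalence in question can be checked on a plain list of tuples. The names rows, names and column_idx are assumptions made for this sketch, and the list comprehension merely stands in for what recordArray["column"] would return.

```python
names = ["x", "y"]
rows = [(1, 2.0), (3, 4.0), (5, 6.0)]

column_idx = names.index("y")

# Column first, then row: the analogue of recordArray["y"][2].
column = [row[column_idx] for row in rows]
by_column = column[2]

# Row first, then column position: the analogue of recordArray[2][column_idx].
by_row = rows[2][column_idx]

print(by_column == by_row)
```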
...
That said, I admit that recordArray.cols["column"][32] would not hurt
my eyes so much. This is because, although the indices continue to mix ints
and strings, ".cols" is placed first, giving a new (and unmistakable)
meaning to the "column" index.
I am just worried that future generalization of indexing will be impossible
if the meaning of an indexing operation ("get row" or "get column or field")
depends on whether an index is a string or an integer: IMO the meaning
should depend on the position in the index list.

The example has been chosen to show that I don't mind indexing by strings at
all. If I see array[13, 'ab', 31, 'ba'], I know that 'ab' and 'ba' index record
fields as long as the indices are in 'normal' order.

Nevertheless, I am aware that Utopia may be hard to implement efficiently, but
this reflects my mental picture of nested (record) arrays.

(ipython in Utopia would allow me to figure out array[13].ab[31].ba by tab
 completion and I would translate this to array[13, 'ab', 31, 'ba'] for
 efficiency in a real program)
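
A rough sketch of position-dependent indexing over nested records, assuming arrays are modelled as Python lists and records as dicts. The resolve function and the sample data are hypothetical and say nothing about how an efficient implementation would look.

```python
def resolve(node, index):
    """Apply the indices left to right; the level decides what each means."""
    for position, key in enumerate(index):
        if isinstance(node, list) and not isinstance(key, int):
            raise TypeError(
                f"position {position}: array level needs an int, got {key!r}")
        if isinstance(node, dict) and not isinstance(key, str):
            raise TypeError(
                f"position {position}: record level needs a field name, got {key!r}")
        node = node[key]
    return node


# An array of records whose field 'ab' is itself an array of records
# with a field 'ba'.
data = [
    {"ab": [{"ba": 1.5}, {"ba": 2.5}]},
    {"ab": [{"ba": 3.5}, {"ba": 4.5}]},
]

# Analogue of array[1, 'ab', 0, 'ba'], i.e. array[1].ab[0].ba
print(resolve(data, (1, "ab", 0, "ba")))
```

Here an index in the wrong position (for example a field name where a row number is expected) is rejected rather than silently reinterpreted.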

I think that we agree that recordArray.cols["column"] is better than
recordArray["column"], but I don't see why recordArray.cols["column"] is
better than the original recordArray.field("column").

Cheers -- Gerard

PS: after reading the above, there may be a case for accepting only indexing
    that can be read from left to right, so
    recordArray[32].field_name is OK, but recordArray.field_name[32] is not.

Gerard Vermeulen