[Numpy-discussion] Record arrays

Fri Jun 27 01:10:18 EDT 2008

On Thu, Jun 26, 2008 at 1:25 PM, Robert Kern <robert.kern at gmail.com> wrote:

> One downside of this is that the attribute access feature slows down
> all field accesses, even the r['foo'] form, because it sticks a bunch
> of pure Python code in the middle. Much code won't notice this, but if
> you end up having to iterate over an array of records (as I have),
> this will be a hotspot for you.
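
A rough way to check this claim on a given NumPy version is to time a record-by-record loop over a plain structured array and over a recarray view of the same data. This is only a sketch: the field names, array size, and the helper `sum_foo` are made up for illustration, and the timings will vary by machine and NumPy release.

```python
import timeit

import numpy as np

# Illustrative dtype and data; names and sizes are assumptions.
dt = np.dtype([("foo", "i4"), ("bar", "f8")])
plain = np.zeros(1000, dtype=dt)   # ordinary structured ndarray
rec = plain.view(np.recarray)      # same memory, viewed as a recarray


def sum_foo(arr):
    """Iterate record by record, using the r['foo'] form of field access."""
    total = 0
    for record in arr:
        total += record["foo"]
    return total


t_plain = timeit.timeit(lambda: sum_foo(plain), number=20)
t_rec = timeit.timeit(lambda: sum_foo(rec), number=20)
print(f"structured ndarray: {t_plain:.4f} s")
print(f"recarray:           {t_rec:.4f} s")
```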

I wonder if it wouldn't be useful for *all* numpy arrays to have a .f
attribute that would provide attribute access to fields for structured
(compound) dtypes:

In [13]: r['foo']
Out[13]: array([1, 1, 1])

In [14]: r.f.foo
-> Hypothetically, same as [13] above

This object would in general be an empty namespace, which avoids the
potential for name collisions that recarrays have. It could also
normalize field names to valid Python identifiers (spaces to _, etc.)
and provide TAB completion of names. Since the .f object would be a
*separate* object, the main array wouldn't need complex Python code in
the fast path, and there would be no speed penalty for other uses of
the top-level object.
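
To make the idea concrete, here is a minimal sketch of what such a field object could look like as a standalone wrapper. Nothing like this exists in numpy as far as this message is concerned: the class name `FieldNamespace` and its normalization rules are hypothetical, and a real .f attribute would have to be wired into ndarray itself.

```python
import keyword
import re

import numpy as np


class FieldNamespace:
    """Hypothetical stand-in for the proposed ``.f`` attribute."""

    def __init__(self, arr):
        if arr.dtype.names is None:
            raise TypeError("array has no named fields")
        self._arr = arr
        # Map normalized identifier -> real field name.
        # If two fields normalize to the same identifier, the last one wins.
        self._names = {}
        for name in arr.dtype.names:
            ident = re.sub(r"\W", "_", name)  # spaces etc. become "_"
            if ident[0].isdigit() or keyword.iskeyword(ident):
                ident = "_" + ident
            self._names[ident] = name

    def __getattr__(self, attr):
        # Only reached when normal attribute lookup fails.
        if attr.startswith("__") or attr in ("_arr", "_names"):
            raise AttributeError(attr)
        try:
            return self._arr[self._names[attr]]
        except KeyError:
            raise AttributeError(attr) from None

    def __dir__(self):
        # Expose only field names, so TAB completion lists just the fields.
        return sorted(self._names)


r = np.ones(3, dtype=[("foo", "i4"), ("bar baz", "f8")])
f = FieldNamespace(r)
print(f.foo)      # same data as r['foo']
print(f.bar_baz)  # field "bar baz", reached via a normalized name
print(dir(f))
```

Because the wrapper is a separate object, the array's own indexing path is untouched; only code that goes through the wrapper pays for the Python-level lookup.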

I've never quite liked recarrays because they blend the named fields
into the main namespace, and because they don't tab complete. I'd
happily pay the price of accessing a sub-object for cleaner and more
useful access to fields (I could always do xf = x.f if I'm really
going to use the field object a lot).

Just an idea; perhaps it's already been shot down in the past.

Cheers,

f