[Numpy-discussion] recarray field names

Erin Sheldon erin.sheldon at gmail.com
Wed Mar 15 16:37:23 EST 2006


Nice.  Python decides to compare with the keys and not the values.

The possibilities for obfuscation are endless.

On 3/15/06, Fernando Perez <Fernando.Perez at colorado.edu> wrote:
> Erin Sheldon wrote:
>
> > Yes, I see, but I think you meant
> >
> >     if name in t.dtype.fields.keys():
>
> No, he really meant:
>
> if name in t.dtype.fields:
>
> Dictionaries are iterable and support membership tests directly, so you don't
> need to construct the list of keys separately.  It's just a redundant waste of
> time and memory in most cases, unless you intend to modify the dict in your
> loop, in which case the iterator approach won't work and you /do/ need the
> explicit keys() call.
>
> In addition
>
> if name in t.dtype.fields
>
> is faster than:
>
> if name in t.dtype.fields.keys()
>
> The first is O(1) on average: it requires a single call to the hash function
> on 'name' and then a C lookup in the dict's internal hash table.  The second
> is O(N): keys() builds a list, and the test is a direct walkthrough of that
> list with python-level equality testing.
>
> In [15]: nkeys = 1000000
>
> In [16]: dct = dict(zip(keys,[None]*len(keys)))
>
> In [17]: time bool(-1 in keys)
> CPU times: user 0.01 s, sys: 0.00 s, total: 0.01 s
> Wall time: 0.01
> Out[17]: False
>
> In [18]: time bool(-1 in dct)
> CPU times: user 0.00 s, sys: 0.00 s, total: 0.00 s
> Wall time: 0.00
> Out[18]: False
>
>
> In realistic cases for your original question you are not likely to see the
> difference, but it's always a good idea to be aware of the performance
> characteristics of various approaches.  For a different problem, there may
> well be a real difference.
>
> Cheers,
>
> f
>
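The points in the quoted message can be sketched with a plain dict. This is
only an illustration: `fields` is a hypothetical stand-in for t.dtype.fields,
and its contents are made up. Note also that in Python 3, keys() returns a
view rather than a list, so the extra list construction described above
applies to the Python 2 behaviour discussed in this thread.

```python
# Hypothetical stand-in for t.dtype.fields: field name -> (type, byte offset).
fields = {"x": ("f8", 0), "y": ("i4", 8)}
name = "x"

# Membership on the dict itself tests the keys via a hash lookup.
in_dict = name in fields

# Same answer through keys(); a list in Python 2, a view in Python 3.
in_keys = name in fields.keys()

# Values are only searched if you ask for them explicitly.
in_values = name in fields.values()

# Changing a dict's size while iterating over it raises RuntimeError,
# so iterate over a snapshot of the keys when deleting entries.
pruned = dict(fields)
for key in list(pruned):
    if key != name:
        del pruned[key]
```

The list(pruned) call is the modern spelling of the explicit keys() call
mentioned above for the case where the dict is modified inside the loop.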

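The quoted session does not show how `keys` was defined. A minimal
reproduction, assuming it was the integers 0..nkeys-1 and using the standard
timeit module in place of IPython's `time`, might look like this:

```python
import timeit

nkeys = 1000000
keys = list(range(nkeys))   # assumption: the session's `keys` was range(nkeys)
dct = dict.fromkeys(keys)   # same mapping as dict(zip(keys, [None] * len(keys)))

# -1 is absent, so the list test has to compare against every element,
# while the dict test is a single hash lookup.
list_time = timeit.timeit(lambda: -1 in keys, number=10)
dict_time = timeit.timeit(lambda: -1 in dct, number=10)

print("list: %.4f s, dict: %.6f s (10 lookups each)" % (list_time, dict_time))
```

Absolute timings depend on the machine and Python version, so only the gap
between the two numbers is meaningful.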