However: is there an automatic way to convert a named index to a position? What about looping over the tuples of my recarray:

    for t in d:
        date = t['Date']
        ....

I guess that the above has to look up 'Date' each time. But the following does not need the hash lookup for each tuple:

    for t in d:
        date = t[0]
        ....

Should I create a map from dtype.names and use that to look up the index from the name in advance (if I really, really want to factor out the lookup of 'Date')?

On Wed, Jul 21, 2010 at 3:47 PM, wheres pythonmonks <wherespythonmonks@gmail.com> wrote:
Thank you very much.... better crack open a numpy reference manual instead of relying on my python "intuition".
On Wed, Jul 21, 2010 at 3:44 PM, Pauli Virtanen <pav@iki.fi> wrote:
Wed, 21 Jul 2010 15:12:14 -0400, wheres pythonmonks wrote:
I have a recarray -- the first column is date.
I have the following function to compute the number of unique dates in my data set:
    def byName():
        return len(list(set(d['Date'])))
What this code does is:
1. d['Date']
Extract an array slice containing the dates. This is fast.
2. set(d['Date'])
Make copies of each array item, and box them into Python objects. This is slow.
Insert each of the objects into the set. This is also somewhat slow.
3. list(set(d['Date']))
Get each item in the set, and insert them into a new list. This is somewhat slow, and unnecessary if you only want to count.
4. len(list(set(d['Date'])))
So the slowness arises because the code is copying data around, and boxing it into Python objects.
You should try using NumPy's set routines to do this, since they operate on the array directly and don't re-box the data: http://docs.scipy.org/doc/numpy/reference/routines.set.html
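
As a rough illustration of the suggestion above, here is a minimal sketch that counts unique dates with np.unique instead of a Python set. The array d is a small made-up stand-in for the recarray in the thread, and the function names by_name_python and by_name_numpy are hypothetical; only the 'Date' field name comes from the original post.

```python
import numpy as np

# Hypothetical stand-in for the recarray `d` discussed in the thread.
d = np.array(
    [
        ("2010-07-19", 1.0),
        ("2010-07-20", 2.5),
        ("2010-07-20", 3.0),
        ("2010-07-21", 4.2),
    ],
    dtype=[("Date", "U10"), ("Value", "f8")],
).view(np.recarray)


def by_name_python(records):
    # Boxes every element into a Python object before hashing it.
    # The intermediate list() from the original is dropped: len() works on a set.
    return len(set(records["Date"]))


def by_name_numpy(records):
    # np.unique sorts and de-duplicates inside NumPy, without boxing each item.
    return np.unique(records["Date"]).size


n_python = by_name_python(d)
n_numpy = by_name_numpy(d)
print(n_python, n_numpy)
```

Both functions should agree on the count; how much faster the NumPy version is will depend on the size and dtype of the real data, so it is worth timing on the actual recarray.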
-- Pauli Virtanen
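
On the follow-up question at the top of the thread (resolving 'Date' to a position once, rather than on every iteration), one possible sketch is below. It relies on dtype.names being a tuple attribute of field names in order; the sample data and the variable names date_pos and positions are made up for illustration.

```python
import numpy as np

# Hypothetical stand-in for the recarray `d` discussed in the thread.
d = np.array(
    [("2010-07-19", 1.0), ("2010-07-20", 2.5), ("2010-07-21", 4.2)],
    dtype=[("Date", "U10"), ("Value", "f8")],
).view(np.recarray)

# dtype.names is a tuple of field names in order (an attribute, not a method),
# so the position of a field can be resolved once, before the loop.
date_pos = d.dtype.names.index("Date")

# If several fields are needed, a name -> position map can be built up front.
positions = {name: i for i, name in enumerate(d.dtype.names)}

# Positional access inside the loop, with no per-row name lookup.
dates_by_pos = [t[date_pos] for t in d]

# The name-based loop from the question, for comparison.
dates_by_name = [t["Date"] for t in d]

# Often the loop is not needed at all: pull the whole column out once.
date_column = d["Date"]
```

Whether the positional lookup is measurably faster than the name lookup is an open question here; extracting the column once with d['Date'] avoids the per-row lookup either way.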
_______________________________________________ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion