[Numpy-discussion] record arrays initialization

Aronne Merrelli aronne.merrelli at gmail.com
Wed May 2 23:03:15 EDT 2012


On Wed, May 2, 2012 at 6:46 PM, Kevin Jacobs <jacobs at bioinformed.com>
<bioinformed at gmail.com> wrote:
> On Wed, May 2, 2012 at 7:25 PM, Aronne Merrelli <aronne.merrelli at gmail.com>
> wrote:
>>
>> In general this is a good suggestion - I was going to mention it
>> earlier - but I think for this particular problem it is not better
>> than the "brute force" and argmin() NumPy approach. On my laptop, the
>> KDTree query is about a factor of 7 slower (ignoring the time cost to
>> create the KDTree)
>>
>
> The cKDTree implementation is more than 4 times faster than the brute-force
> approach:
>
> T = scipy.spatial.cKDTree(targets)
>
> In [11]: %timeit foo1(element, targets)   # Brute force
> 1000 loops, best of 3: 385 us per loop
>
> In [12]: %timeit foo2(element, T)         # cKDTree
> 10000 loops, best of 3: 83.5 us per loop
>
> In [13]: 385/83.5
> Out[13]: 4.610778443113772

Wow, not sure how I missed that! It even seems to scale better than
linearly with the number of query points (some of that 83 us for a
single query is call overhead, I guess):

In [35]: %timeit foo2(element, T)
10000 loops, best of 3: 115 us per loop
In [36]: elements = np.random.uniform(0,8,[128,512,7])
In [37]: %timeit foo2(elements.reshape((128*512,7)), T)
1 loops, best of 3: 2.66 s per loop

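So it takes only 2.7 seconds to search the whole set of 128*512 = 65536
elements, or about 41 us per element versus 115 us for a single-element
call. Not bad!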
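
For reference, here is a minimal sketch of the two approaches being compared. The bodies of foo1 and foo2 are not shown in this message, so the functions below (named nearest_brute and nearest_tree here) are assumptions about what they roughly do. The size of targets and the reduced shape of elements are also made up to keep the example quick.

```python
import numpy as np
from scipy.spatial import cKDTree


def nearest_brute(element, targets):
    """Index of the row of targets closest to element (assumed form of foo1)."""
    # Squared Euclidean distance from element to every target row.
    d2 = ((targets - element) ** 2).sum(axis=1)
    return d2.argmin()


def nearest_tree(points, tree):
    """Indices of the nearest target for one point or many (assumed form of foo2)."""
    # cKDTree.query returns (distances, indices); only the indices are kept.
    _, idx = tree.query(points)
    return idx


rng = np.random.default_rng(0)
targets = rng.uniform(0, 8, (5000, 7))     # hypothetical target set
tree = cKDTree(targets)                    # built once, reused for all queries

elements = rng.uniform(0, 8, (16, 32, 7))  # smaller than the 128x512 case above
flat = elements.reshape(-1, 7)             # one query point per row

idx_tree = nearest_tree(flat, tree)        # single vectorized call
idx_brute = np.array([nearest_brute(e, targets) for e in flat])
```

Passing the whole reshaped array to a single query call is what spreads the per-call overhead over all the points.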

Cheers,
Aronne


