[SciPy-Dev] Interest in improvements to the cKDTree codebase?

Ralf Gommers ralf.gommers at gmail.com
Sat Jun 9 16:12:27 EDT 2018


Hi Jesse,


On Tue, May 22, 2018 at 10:56 AM, Jesse Livezey <jesse.livezey at gmail.com>
wrote:

> Hi everyone,
>
> I'm using cKDTrees for a project and noticed two potential ways to improve
> the code and have written one additional count method that might be useful
> to others.
>
> I have written code and some tests for all three items and can contribute
> if there is interest.
>
> 1) Allowing an array of rs in cKDTree.query_ball_point(). I started a PR
> here <https://github.com/scipy/scipy/pull/8818>. In principle, this
> should speed up multiple queries with different rs, but see 2.
>
> 2) I noticed that for the cases when cKDTree.query_ball_point() returns
> large numbers of points (>100), it is faster to loop over queries in
> python. This is true both with and without my PR. This is largely because
> single point queries do not sort the return indices, but multi-point
> queries DO sort them (see details here
> <https://github.com/scipy/scipy/issues/8838>). Removing the sorted()
> leads to considerable speedups, but is not backwards compatible. However,
> the sorted behavior is not in the method description and is not even
> internally consistent, so maybe it could just be removed or made optional?
>

I agree with Sturla's reply on the issue; adding a keyword that defaults to
the current behavior and allows turning off sorting is probably the way to
go.


> 3) I have written a cKDTree.count_ball_point() method that behaves like
> query_ball_point() but just returns the number of points rather than their
> indices. This is much faster than calling len(query_ball_point()).
>

How much faster is it after adding the keyword for (2)?

Cheers,
Ralf
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/scipy-dev/attachments/20180609/90a18d9a/attachment.html>


More information about the SciPy-Dev mailing list