[scikit-learn] Nearest neighbor search with 2 distance measures

Jacob Vanderplas jakevdp at cs.washington.edu
Tue Aug 1 13:25:52 EDT 2017


Hi Rohin,
Ah, I see. I don't think a BallTree is the right data structure for an
anisotropic N-point query, because it fundamentally assumes spherical
symmetry of the metric. You may be able to do something like this with a
specialized KD-tree, but scikit-learn doesn't support this, and I don't
imagine that it ever will given the very specialized nature of the
application.

I'm certain someone has written efficient code for this operation in the
astronomy community, but I don't know of any good Python package to
recommend for this – I'd suggest googling for keywords and seeing where
that gets you.

Thanks,
  Jake

 Jake VanderPlas
 Senior Data Science Fellow
 Director of Open Software
 University of Washington eScience Institute

On Tue, Aug 1, 2017 at 6:15 AM, Rohin Kumar <yrohinkumar at gmail.com> wrote:

> Since you seem to be from Astrophysics/Cosmology background (I am assuming
> you are jakevdp - the creator of astroML - if you are - I am lucky!), I can
> explain my application scenario. I am trying to calculate the anisotropic
> two-point correlation function something like done in rp_pi_tpcf
> <http://halotools.readthedocs.io/en/latest/api/halotools.mock_observables.rp_pi_tpcf.html#halotools.mock_observables.rp_pi_tpcf>
>  or s_mu_tpcf
> <http://halotools.readthedocs.io/en/latest/api/halotools.mock_observables.s_mu_tpcf.html#halotools.mock_observables.s_mu_tpcf>
> using pairs (DD,DR,RR) calculated from BallTree.two_point_correlation
>
> In halotools (http://halotools.readthedocs.io/en/latest/function_usage/
> mock_observables_functions.html) it is implemented using rectangular
> grids. I could calculate 2pcf with custom metrics using one variable with
> BallTree as done in astroML. I intend to find the anisotropic counter part.
>
> Thanks & Regards,
> Rohin
>
> Y.Rohin Kumar,
> +919818092877 <+91%2098180%2092877>.
>
> On Tue, Aug 1, 2017 at 5:18 PM, Rohin Kumar <yrohinkumar at gmail.com> wrote:
>
>> Dear Jake,
>>
>> Thanks for your response. I meant to group/count pairs in boxes (using
>> two arrays simultaneously-hence needing 2 metrics) instead of one distance
>> array as the binning parameter. I don't know if the algorithm supports such
>> a thing. For now, I am proceeding with your suggestion of two ball trees at
>> huge computational cost. I hope I am able to frame my question properly.
>>
>> Thanks & Regards,
>> Rohin.
>>
>>
>>
>> On Mon, Jul 31, 2017 at 8:16 PM, Jacob Vanderplas <
>> jakevdp at cs.washington.edu> wrote:
>>
>>> On Sun, Jul 30, 2017 at 11:18 AM, Rohin Kumar <yrohinkumar at gmail.com>
>>> wrote:
>>>
>>>> *update*
>>>>
>>>> May be it doesn't have to be done at the tree creation level. It could
>>>> be using loops and creating two different balltrees. Something like
>>>>
>>>> tree1=BallTree(X,metric='metric1') #for x-z plane
>>>> tree2=BallTree(X,metric='metric2') #for y-z plane
>>>>
>>>> And then calculate correlation functions in a loop to get tpcf(X,r1,r2)
>>>>  using tree1.two_point_correlation(X,r1) and
>>>> tree2.two_point_correlation(X,r2)
>>>>
>>>
>>> Hi Rohin,
>>> It's not exactly clear to me what you wish the tree to do with the two
>>> different metrics, but in any case the ball tree only supports one metric
>>> at a time. If you can construct your desired result from two ball trees
>>> each with its own metric, then that's probably the best way to proceed,
>>>    Jake
>>>
>>>
>>>>
>>>> _______________________________________________
>>>> scikit-learn mailing list
>>>> scikit-learn at python.org
>>>> https://mail.python.org/mailman/listinfo/scikit-learn
>>>>
>>>>
>>>
>>> _______________________________________________
>>> scikit-learn mailing list
>>> scikit-learn at python.org
>>> https://mail.python.org/mailman/listinfo/scikit-learn
>>>
>>>
>>
>
> _______________________________________________
> scikit-learn mailing list
> scikit-learn at python.org
> https://mail.python.org/mailman/listinfo/scikit-learn
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/scikit-learn/attachments/20170801/09b22a0f/attachment-0001.html>


More information about the scikit-learn mailing list