[SciPy-Dev] cKDTree

Sylvain Corlay sylvain.corlay at gmail.com
Thu Sep 8 17:12:54 EDT 2016


Hi Pauli, Ralf,

On Thu, Sep 8, 2016 at 8:26 PM, Pauli Virtanen <pav at iki.fi> wrote:
>
> > Just imagine: I have a new uniformly filling sequence, but no proof that
> > it is pseudo-random, and I don't even know the discrepancy of the
> > sequence,
> > but a bunch of examples for which "it works"...  Well, I doubt that
> > anyone would want to use it for Monte-Carlo simulation / cryptography
> > etc...
>
> Are you presenting this as a fair analogy to DE and the literature
> about it?
>
> > In any case, I find this question on the need for a scientific
> > justification worth answering in general - especially in the
> > context of the discussions on scope and the 1.0 release.
>
> Yes, justification that an approach works and that it is regarded as
> useful is required. This is not really the place to do completely
> original research.
>
> If I understand you correctly, you are saying that you would not
> generally recommend DE as a global optimization method, because there are
> superior choices? Or, are you saying it's essentially crock?


That is not really it. I was mostly pointing to DE as an example that sits
in a gray area, in order to raise the questions of scope, criteria for
inclusion into scipy, etc.

When it was included in scipy, it caught my attention, since I had worked
on flavors of DE in the past (it is used as an alternative to Lloyd's
algorithm for optimal quantization). There is indeed some literature about
applications in this area. So I did find it useful, and found that it works
well for the problems I used it for. (However, the sort of generic
implementation proposed today in scipy would not have been a good fit in
that case.)
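For concreteness, here is a minimal sketch of how the generic
implementation now in scipy.optimize is typically called; the Rosenbrock
objective, bounds, seed, and tolerance below are illustrative choices of
mine, not values taken from this discussion:

    from scipy.optimize import differential_evolution, rosen

    # Minimize the Rosenbrock function over a box; the bounds, seed, and
    # tolerance are arbitrary illustrative values.
    bounds = [(-5.0, 5.0)] * 4
    result = differential_evolution(rosen, bounds, seed=0, tol=1e-8)

    print(result.x)    # best parameter vector found
    print(result.fun)  # objective value at that point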

My understanding of scipy's scope is that it is a collection of routines
that are robust reference implementations of well-established numerical
methods of general interest, regardless of the area of application: linear
algebra, interpolation, quadrature, random number generation, convex
optimization, specialized optimizers for dense LP and QP, etc. In each of
these areas, if you need something more specialized, you should probably
use a specialized library or implement something ad hoc for your use case.

Evolutionary optimization algorithms don't seem to fall into this category,
for the reasons that we discussed earlier. They are mostly a set of
heuristics. They are cool, inspired by nature, etc. (however, a large
number of citations is probably not a substitute for a mathematical
proof...). The other stochastic optimization methods that I listed would
have been more natural candidates for the "category" that I roughly
described above, in that they are extremely well established and backed by
theory. I imagine that the inclusion of DE into scipy could have been
questioned at the time, but now that it is in there, it should probably not
be removed without a good alternative. Finally, I am still curious about
what counts as a bug versus a feature in the case of a method like this.

On the subject of the faster flavor of kd-tree that I was proposing, I was
only gauging interest. The long discussion of DE in this thread is mostly
coincidental; my main goal was to use it as an example for the question of
scope, and also to ask about the "big split" idea. If there were to be a
split, a scipy-incubator organization with proposals for inclusion as scipy
subprojects would make sense...
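For context, here is a minimal sketch of the existing scipy.spatial.cKDTree
usage that any faster flavor would presumably be benchmarked against; the
point counts, dimensionality, and seed are arbitrary values of mine:

    import numpy as np
    from scipy.spatial import cKDTree

    # Build a tree on random 3-D points and query nearest neighbours;
    # sizes and seed are only meant for illustration.
    rng = np.random.RandomState(0)
    data = rng.uniform(size=(10000, 3))
    queries = rng.uniform(size=(100, 3))

    tree = cKDTree(data)
    distances, indices = tree.query(queries, k=5)  # 5 nearest neighbours per query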

Sylvain