Re: [SciPy-dev] Volunteer for Scipy Project

5 Oct 2009

      On Mon, Oct 5, 2009 at 7:29 PM, Anne Archibald <peridot.faceted@gmail.com>wrote:
...
2009/10/5 Charles R Harris <charlesr.harris@gmail.com>:
...
On Mon, Oct 5, 2009 at 3:20 PM, Anne Archibald <
peridot.faceted@gmail.com>
...
wrote:
...
2009/10/5 David Goldsmith <d.l.goldsmith@gmail.com>:
...
Just curious, Anne: have you anything in particular in mind (i.e., are
there
some small - or gaping - holes in scipy (IYO, of course) which you
know
...
...
could be filled by a careful implementation of something(s) extant in
the
literature)?
Well, not exactly - the examples I had in mind were minor and/or in
the past. I ran into problems with scipy's hyp2f1, for example, so I
went and looked up the best algorithm I could find for it (and I think
I contributed that code). I wanted the Kuiper test as an alternative
to the Kolmogorov-Smirnov test (it's invariant under cyclic
permutations, and is sensitive to different features of the
distribution) so I looked up the test and the special function needed
to interpret its results. (I haven't contributed this to scipy yet,
mostly because I chose an interface that's not particularly compatible
with that for scipy's K-S test.) And on a larger scale, that's what
scipy.spatial's kdtree implementation is.
For examples where I think a bit of lit review plus implementation
work might help, I'd say that the orthogonal polynomials could use
some work - the generic implementation in scipy.special falls apart
rapidly as you go to higher orders. I always implement my own
Chebyshev polynomials using the cos(n*arccos(x)) expression, for
example, and special implementations for the others might be very
useful.
At what order does the scipy implementation of the Chebyshev polynomials
fall apart? I looked briefly at that package a long time ago, but never
used
it. I ask so I can check the chebyshev module that is going into numpy.
By n=30 they are off by as much as 0.0018 on [-1,1]; n=31 they are off
by 0.1, and by n=35 they are off by four - not great for values that
should be in the interval [-1,1]. This may seem like an outrageously
high degree for a polynomial, but there's no reason they need be this
bad, and one could quite reasonably want to use an order this high,
say for function approximation.
Hmm, I get an maximum error of about 1e-13 for n=100 using my routine, which
isn't great and can be improved a bit with a few tricks, but is probably
good enough.  That's using simple Clenshaw recursion for the evaluation.
There must be something seriously wrong with the scipy version. I routinely
use fits with n > 50 because truncating such a series gives a good
approximation to the minmax fit and it's also nice to see how the
coefficients fall off to estimate the size of the truncation error.

I think this inaccuracy is probably inevitable in a scheme that
...
computes values using a recurrence relation
Not so, using the Cheybshev recurrence in either direction should be stable
for |x| <= 1. It's like multiplying with a complex number of modulus 1,
i.e., a unitary matrix.
...
and something like it probably occurs for all the orthogonal polynomials
that don't have
special-purpose evaluators.
Depends on the recursion, the direction of the recursion, and the domain.

Chuck