On Mon, Oct 5, 2009 at 7:29 PM, Anne Archibald <peridot.faceted@gmail.com>wrote:
2009/10/5 Charles R Harris <charlesr.harris@gmail.com>:
On Mon, Oct 5, 2009 at 3:20 PM, Anne Archibald <
peridot.faceted@gmail.com>
wrote:
2009/10/5 David Goldsmith <d.l.goldsmith@gmail.com>:
Just curious, Anne: have you anything in particular in mind (i.e., are there some small - or gaping - holes in scipy (IYO, of course) which you
know
could be filled by a careful implementation of something(s) extant in the literature)?
Well, not exactly - the examples I had in mind were minor and/or in the past. I ran into problems with scipy's hyp2f1, for example, so I went and looked up the best algorithm I could find for it (and I think I contributed that code). I wanted the Kuiper test as an alternative to the Kolmogorov-Smirnov test (it's invariant under cyclic permutations, and is sensitive to different features of the distribution) so I looked up the test and the special function needed to interpret its results. (I haven't contributed this to scipy yet, mostly because I chose an interface that's not particularly compatible with that for scipy's K-S test.) And on a larger scale, that's what scipy.spatial's kdtree implementation is.
For examples where I think a bit of lit review plus implementation work might help, I'd say that the orthogonal polynomials could use some work - the generic implementation in scipy.special falls apart rapidly as you go to higher orders. I always implement my own Chebyshev polynomials using the cos(n*arccos(x)) expression, for example, and special implementations for the others might be very useful.
At what order does the scipy implementation of the Chebyshev polynomials fall apart? I looked briefly at that package a long time ago, but never used it. I ask so I can check the chebyshev module that is going into numpy.
By n=30 they are off by as much as 0.0018 on [-1,1]; n=31 they are off by 0.1, and by n=35 they are off by four - not great for values that should be in the interval [-1,1]. This may seem like an outrageously high degree for a polynomial, but there's no reason they need be this bad, and one could quite reasonably want to use an order this high, say for function approximation.
Hmm, I get an maximum error of about 1e-13 for n=100 using my routine, which isn't great and can be improved a bit with a few tricks, but is probably good enough. That's using simple Clenshaw recursion for the evaluation. There must be something seriously wrong with the scipy version. I routinely use fits with n > 50 because truncating such a series gives a good approximation to the minmax fit and it's also nice to see how the coefficients fall off to estimate the size of the truncation error. I think this inaccuracy is probably inevitable in a scheme that
computes values using a recurrence relation
Not so, using the Cheybshev recurrence in either direction should be stable for |x| <= 1. It's like multiplying with a complex number of modulus 1, i.e., a unitary matrix.
and something like it probably occurs for all the orthogonal polynomials that don't have special-purpose evaluators.
Depends on the recursion, the direction of the recursion, and the domain. Chuck