[Numpy-discussion] [ANN]: Taylorpoly, an implementation of vectorized Taylor polynomial operations and request for opinions

Friedrich Romstedt friedrichromstedt at gmail.com
Sat Feb 27 18:30:43 EST 2010


2010/2/27 Sebastian Walter <sebastian.walter at gmail.com>:
> I'm sorry this comment turns out to be confusing.

Maybe it's not important.

> It has apparently quite the contrary effect of what I wanted to achieve:
> Since there is already a polynomial module  in numpy I wanted to
> highlight their difference
> and stress that they are used to do arithmetic, e.g. compute
>
> f([x],[y]) = [x] * (sin([x])**2 + [y])
>
> in Taylor arithmetic.

That's cool!  You didn't mention that.  Step by step I'm finding out
what your module (package?) is for.  You are a mathematician?  Many
physicists complain that mathematicians cannot make their point ;-)

I think I can use that to make my upy accept arbitrary functions, but
how do you apply sin() to a TTP?
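
To make the question concrete, this is how I would naively do it in
plain Python, using the recurrence one gets from s'(t) = cos(x(t)) x'(t)
and c'(t) = -sin(x(t)) x'(t) (the function name is of course my
invention, not your API):

import math

def taylor_sin(x):
    # x: coefficient list [x_0, ..., x_{D-1}]; returns the
    # coefficients of sin(x(t)), truncated to the same degree
    D = len(x)
    s = [math.sin(x[0])] + [0.0] * (D - 1)
    c = [math.cos(x[0])] + [0.0] * (D - 1)
    for k in range(1, D):
        # k*s_k = sum_{j=1..k} j*x_j*c_{k-j}, and the analogous
        # recurrence with a minus sign for the cosine coefficients
        s[k] = sum(j * x[j] * c[k - j] for j in range(1, k + 1)) / k
        c[k] = -sum(j * x[j] * s[k - j] for j in range(1, k + 1)) / k
    return s

print(taylor_sin([0.5, 1.0, 0.0, 0.0]))
# -> [0.479..., 0.877..., -0.239..., -0.146...]

Is this roughly what happens inside taylorpoly?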

One more question: you said t is an "external parameter".  I, and
maybe not only I, interpreted this as a complicated name for
"variable".  So I assumed it would be a parameter to some method of
the TTP.  But it isn't?  It's just the way to define the ring?  You
could define it the same way in Fourier space, except that you have
to make the array large enough from the beginning?  Why not do that,
and say that your computation relies on the Fourier transform of the
representation?  Can this give insight into why TTPs form a ring and
why they have zero divisors?
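
At least the zero divisors I can see directly from the truncation; a
toy computation (my own code, assuming coefficient lists cut off after
degree D - 1):

import numpy as np

def truncated_mul(x, y):
    # Cauchy product of two truncated Taylor polynomials,
    # dropping every coefficient of order >= D
    D = len(x)
    return np.array([sum(x[i] * y[k - i] for i in range(k + 1))
                     for k in range(D)])

# D = 3: the element t, coefficients [0, 1, 0], is nonzero, but
# t**2 * t = t**3 falls victim to the truncation:
t = np.array([0.0, 1.0, 0.0])
t_squared = truncated_mul(t, t)          # [0. 0. 1.]
print(truncated_mul(t_squared, t))       # [0. 0. 0.]

Two nonzero elements multiply to zero, so the ring cannot be a field;
does the Fourier picture show this equally directly?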

>>>>  In fact, you /have/ to provide
>>>> external binary operators, because I guess you also want to have
>>>> numpy.ndarrays as left operand.  In that case, the ndarray will have
>>>> higher precedence, and will treat your data structure as a scalar,
>>>> applying the operation to all of the ndarray's elements.
>>>
>>> well, actually it should treat it as a scalar since the Taylor
>>> polynomial is something like a real or complex number.
>>
>> Maybe I misunderstood you completely, but didn't you want to implement
>> arrays of polynomials using numpy?  So I guess you want to apply a
>> vector from numpy pairwise to the polynomials in the P-object?
>
> no, the vectorization is something different.  Its purpose only
> becomes clear when applied in Algorithmic Differentiation.

Hey folks, here's a cool package, but the maintainer didn't tell us! ;-)

>  E.g. if you have a function
> f: R^N -> R
> x -> y=f(x)
> where x = [x1,...,xN]
>
> and you want to compute the gradient g(x) of f(x), then you can compute
> df(x)/dxn by propagating  the following array of Taylor polynomials:
>
> x = numpy.array([UTPS([x1_0, 0]), ..., UTPS([xn_0, 1]), ...,
> UTPS([xN_0, 0])], dtype=object)
> y = f(x)

So what is the result of applying f to some UTPS instance: is it a
plain number, or a UTPS again?  How is it calculated?

Can one use your method to calculate the derivative of some function
at a certain point without knowing the analytical derivative?  I
guess that's the purpose.
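
If I understand it right, the mechanism is something like the
following toy class (my own sketch, certainly not your actual UTPS):
overload the arithmetic operators so that applying f returns such an
object again, with the derivative carried along by the sum and
product rules.

class Jet:
    # first-order truncated Taylor polynomial x_0 + x_1 * t
    def __init__(self, x0, x1=0.0):
        self.x0, self.x1 = x0, x1
    def _lift(self, other):
        return other if isinstance(other, Jet) else Jet(other)
    def __add__(self, other):
        other = self._lift(other)
        return Jet(self.x0 + other.x0, self.x1 + other.x1)
    __radd__ = __add__
    def __mul__(self, other):
        other = self._lift(other)
        # product rule: (x*y)' = x'*y + x*y'
        return Jet(self.x0 * other.x0,
                   self.x1 * other.x0 + self.x0 * other.x1)
    __rmul__ = __mul__

def f(x):
    return x * x * x + 2 * x        # f'(x) = 3*x**2 + 2

y = f(Jet(3.0, 1.0))                # seed the derivative with 1
print(y.x0, y.x1)                   # 33.0 29.0, i.e. f(3) and f'(3)

So f applied to such an object gives an object of the same type back,
and no analytical derivative is needed anywhere.  Is that about right?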

> if you want to have the complete gradient, you will have to repeat N
> times, each time with the same zeroth coefficients [x1,...,xN].
>
> Using the vectorized version, you would do only one propagation:
> x = numpy.array([UTPS([x1_0, 1,0,...,0]), ..., UTPS([xn_0,
> 0,...,1,...,0]), ..., UTPS([xN_0, 0,...,1])], dtype=object)
> y = f(x)
>
> i.e. you save the overhead of calling the same function N times.

OK, I understand.  Today it's too late; I will reason about it tomorrow.
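
Let me restate it as a sketch anyway, to check my understanding (the
class VJet is my own toy, not your code): the zeroth coefficient is
shared, and all N first-order coefficients travel through f together,
so one evaluation gives the whole gradient.

import numpy as np

class VJet:
    # shared zeroth coefficient plus P directional first-order
    # coefficients, all propagated in one go
    def __init__(self, x0, dx):
        self.x0, self.dx = x0, np.asarray(dx, dtype=float)
    def __add__(self, other):
        return VJet(self.x0 + other.x0, self.dx + other.dx)
    def __mul__(self, other):
        return VJet(self.x0 * other.x0,
                    self.dx * other.x0 + self.x0 * other.dx)

def f(x, y):
    return x * y + x * x

# seed directions = identity, so dx holds the gradient afterwards
x = VJet(2.0, [1.0, 0.0])
y = VJet(5.0, [0.0, 1.0])
z = f(x, y)
print(z.x0, z.dx)      # 14.0 [9. 2.], i.e. f and its gradient at (2, 5)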

>> Why don't you use multidimensional arrays?  Is the reason
>> accessibility from C?  Now, as I see it, you implement your strides manually.
>>  With a multidimensional array, you could even create arrays of shape
>> (10, 12) of D-polynomials by storing an ndarray of shape (10, 12, D)
>> or (D, 10, 12).
>
> the goal is to have several Taylor polynomials evaluated at the same
> base point, e.g.
> [x_0, x_{11}, x_{21}, x_{31}]
> [x_0, x_{12}, x_{22}, x_{32}]
> [x_0, x_{13}, x_{23}, x_{33}]
>
> i.e. P=3, D=3
> One could use a (P,D) array.  However, one would do unnecessary
> computations, since x_0 is the same for all P polynomials.
> I.e. one implements the data structure as
> [x]_{D,P} := [x_0, x_{1,1},...,x_{1,D-1},x_{2,1},...,x_{P,D-1}].
>
> This results in non-constant stride access.

No?  I think the D-axis stride should be P with offset 1, and the
P-axis stride 1?  Is there a specific reason to store it raveled,
rather than as x_0 plus an ndarray of shape (P, D - 1)?
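
To make my suggestion concrete, a sketch with made-up numbers
(P = 3 polynomials of degree D - 1 = 3):

import numpy as np

P, D = 3, 4
x0 = 1.7                                       # shared base point
higher = np.arange(1.0, 1.0 + P * (D - 1)).reshape(P, D - 1)

# your raveled layout [x_0, x_{1,1}, ..., x_{1,D-1}, ..., x_{P,D-1}]
# is recovered on demand:
raveled = np.concatenate(([x0], higher.ravel()))

# while the coefficients of one polynomial p are a plain row:
p = 1
print(np.concatenate(([x0], higher[p])))       # [1.7 4. 5. 6.]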

>> Furthermore, I believe there is some elegant way of formulating the
>> product for a single polynomial.  Let me think a bit ...
>>
>> For one entry [z]_E, you want to sum up all pairs:
>>    x_{0} y_{E} + ... + x_{D} y_{E-D} ,     (1)
>> right?  In a matrix containing all mutual products x_{i} y_{j}, these
>> are the diagonals.  Rotating the matrix by 45° counterclockwise, they
>> become column sums.  Hmmm... how does one rotate a matrix by 45°?
>>
>> Another fresh look: (1) looks pretty much like the discrete
>> convolution of x_{i} and y_{i} evaluated at argument E.  Going into
>> Fourier space, this becomes a product.  x_{i} and y_{i} have finite
>> support, because one sets x_{i} = y_{i} = 0 outside 0 <= i <= D.  The
>> support of (1) in the running index is at most [0, D].  The support
>> in E is at most [0, 2D].  Thus you make no mistakes by using the DFT
>> if you pad x and y with D zeros on the right.  My strategy is: push
>> the padded versions of x and y into Fourier space, calculate their
>> product, transform back, cut away the last D entries, and you're
>> done!
>>
>> I guess you mainly want to parallelise the calculation rather than
>> improve the singleton calculation itself?  Even without FFT, a plain
>> DFT would take 2D explicit Python sums for the forward transform and
>> another 2D sums for transforming back; is this an improvement over
>> the method you are using currently?  And you could code it in pure
>> Python :-)
>>
>> I will investigate this tonight.  I'm curious.  Irrespective of
>> that, I also have other fish to fry ... ;-)
>>
>> IIRC, the DFT is in fact used for efficient polynomial
>> multiplication.  Maybe Chuck or another numpy core developer can
>> tell us whether numpy.polynomial actually does it that way?
>
> I believe the degree D is typically much too small (i.e. D <= 4) to
> justify the additional overhead of using FFT,
> though there may be cases where really high order polynomials are used.

I guess it's not overhead.  The number of calculations should break
even already at very low D, or am I wrong?  And you win by not having
to compile a C library, needing only plain Python source.  E.g., your
optimisation package is quite interesting for me, but I'm using
Windows as my main system, so it will be painful to compile.  And the
code would be more straightforward, more concise, easier to maintain
and easier to understand ... :-)  I really do not want to diminish
your programming skills, please do not misunderstand!  I only mean
the subject.
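
For concreteness, the DFT strategy I sketched above in a few lines of
numpy (pure Python, nothing to compile; I haven't benchmarked it
against your C implementation):

import numpy as np

def truncated_mul_fft(x, y):
    # pad both coefficient vectors to length 2*D, multiply in
    # Fourier space, transform back, keep the first D coefficients
    D = len(x)
    X = np.fft.rfft(x, 2 * D)
    Y = np.fft.rfft(y, 2 * D)
    return np.fft.irfft(X * Y, 2 * D)[:D]

x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, 5.0, 6.0])
print(truncated_mul_fft(x, y))    # [ 4. 13. 28.], same as the direct sum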

Friedrich


