[ANN]: Taylorpoly, an implementation of vectorized Taylor polynomial operations and request for opinions

Announcement:
-----------------------

I have started to implement vectorized univariate truncated Taylor polynomial operations (add, sub, mul, div, sin, exp, ...) in ANSI C. The interface to Python is implemented by using numpy.ndarray's ctypes functionality. Unit tests are implemented using nose. It is BSD licensed and hosted on github: http://github.com/b45ch1/taylorpoly

Rationale:
------------------------

A truncated Taylor polynomial is of the form [x]_D = \sum_{d=0}^{D-1} x_d t^d, where x_d is a real number and t an external parameter. Truncated Taylor polynomials should not be confused with polynomials for interpolation, as implemented in numpy.polynomial. The difference is that Taylor polynomials, as implemented in `taylorpoly`, are polynomials in an "external parameter", i.e. they are *never* evaluated for some t. One should think of this algebraic class as an extension of the real numbers (e.g. similar to the complex numbers). At the moment, I am not aware of any such algorithms available in Python. I believe algorithms for this algebraic structure are a very important building block for more sophisticated algorithms, e.g. for Algorithmic Differentiation (AD) tools and Taylor polynomial integrators.

More Detailed Description:
---------------------------------------

The operations add, mul, etc. are extended to compute on truncated Taylor polynomials, e.g. for the multiplication

[z]_{D+1} =_{D+1} [x]_{D+1} y_{D+1}
\sum_{d=0}^{D} z_d t^d =_{D+1} \left( \sum_{d=0}^{D} x_d t^d \right) \left( \sum_{c=0}^{D} y_c t^c \right)

where D+1 is the number of coefficients of the polynomial. The coefficients z_d are only computed up to degree D, i.e. higher orders are truncated.

Request for Opinions:
-------------------------------

Before I continue implementing more algorithms, it would be nice to have some feedback. There are things I'm still not sure about:

1) Data structure for vectorized operations: For the non-vectorized algorithms, it is quite clear that the coefficients of a Taylor polynomial \sum_{d=0}^{D-1} x_d t^d are stored in a 1D array [x_0, x_1, ..., x_{D-1}]. In the vectorized version, P different Taylor polynomials are computed at once, but the base coefficient x_0 is the same for all of them. At the moment I have implemented the data structure
[x]_{D,P} := [x_0, x_{1,1}, ..., x_{1,D-1}, x_{2,1}, ..., x_{P,D-1}].
Another possibility would be
[x] = [x_0, x_{1,1}, ..., x_{1,P}, x_{2,1}, ..., x_{D-1,P}].
Any thoughts about which to choose? The first version is much easier to implement; the second is possibly easier for a compiler to vectorize.

2) Implementation of binary operators in Python: I have defined a class UTPS (univariate Taylor polynomial over Scalar) that basically only stores the above array [x]_{D,P} in the attribute `UTPS.data`, and the algorithms add, sub, mul, div take instances of the class UTPS, i.e. the functions are not implemented as member functions. I plan to add member functions later, for convenience, that call those functions. Is this a good idea? I think it would be good to be consistent with numpy.

3) Coding style of the algorithms in C: I'm making heavy use of pointer arithmetic, rendering the algorithms a little hard to write and to understand. The alternative would be array indexing with address computations. Does anyone know how good compilers are nowadays at optimizing away the address computations?

I'd be very happy to get some feedback.

regards,
Sebastian
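For readers who want to see the arithmetic concretely, here is a minimal pure-Python sketch of the truncated multiplication defined above. It is illustration only, not the taylorpoly C implementation, and the function name is made up:

def truncated_mul(x, y):
    """z_d = sum_{k=0}^{d} x_k * y_{d-k}; coefficients of degree >= D are discarded."""
    D = len(x)
    return [sum(x[k] * y[d - k] for k in range(d + 1)) for d in range(D)]

# (1 + t) * (2 + 3 t) = 2 + 5 t + 3 t^2, kept up to the t^2 coefficient
print(truncated_mul([1.0, 1.0, 0.0], [2.0, 3.0, 0.0]))   # [2.0, 5.0, 3.0]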

To the core developers (of numpy.polynomial, e.g.): Skip the mess and read the last paragraph. The other things I will post back to the list, where they belong. I just didn't want to have off-topic discussion there.
I wanted to stress that one can do arithmetic on Taylor polynomials in a very similar way as with complex numbers.
I do not understand completely. What is the analogy with complex numbers which one cannot draw to real numbers? Or, more precisely: What /is/ actually the analogy besides that there are operations? With i, i * i = -1, thus one does not have to discard terms, contrary to the polynomial product as you defined it, no?
I guess there are also situations when you have polynomials z = \sum_{d=0}^{D-1} z_d t^d, where z_d are complex numbers.
I like it more to implement operators as overloads of the __mul__
I thought about something like def __mul__(self, other): return mul(self,other)
Yes, I know! And in fact, it may symmetrise the thing a bit.
etc., but this is a matter of taste. In fact, you /have/ to provide external binary operators, because I guess you also want to have numpy.ndarrays as left operand. In that case, the ndarray will have higher precedence, and will treat your data structure as a scalar, applying it to all the ndarray's elements.
well, actually it should treat it as a scalar since the Taylor polynomial is something like a real or complex number.
Maybe I misunderstood you completely, but didn't you want to implement arrays of polynomials using numpy? So I guess you want to apply a vector from numpy pairwise to the polynomials in the P-object?
[z]_{D+1} =_{D+1} [x]_{D+1} y_{D+1}
\sum_{d=0}^{D} z_d t^d =_{D+1} \left( \sum_{d=0}^{D} x_d t^d \right) \left( \sum_{c=0}^{D} y_c t^c \right)
Did you forget the [] around the y in line 1, or is this intentional? Actually, I had to compile the things before I could imagine what you could mean.

Why don't you use multidimensional arrays? Does it have reasons in the C accessibility? As I see it, you implement your strides manually. With a multidimensional array, you could even create arrays of shape (10, 12) of D-polynomials by storing an ndarray of shape (10, 12, D) or (D, 10, 12).

Just out of curiosity: Why do you set x_{0,1} = ... = x_{0,P}?

Furthermore, I believe there is some elegant way of formulating the product for a single polynomial. Let me think a bit ... For one entry of [z]_E, you want to sum up all pairs

x_0 y_E + ... + x_D y_{E-D} ,    (1)

right? In a matrix containing all mutual products x_i y_j, these are diagonals. Rotating the matrix by 45° counterclockwise, they are sums in columns. Hmmm ... how to rotate a matrix by 45°?

Another fresh look: (1) looks pretty much like the discrete convolution of x_i and y_i at argument E. Going into Fourier space, this becomes a product. x_i and y_i have finite support, because one sets x_i and y_i = 0 outside 0 <= i <= D. The support of (1) in the running index is at most [0, D]. The support in E is at most [0, 2D]. Thus you don't make mistakes by using the DFT, when you pad x and y by D zeros on the right. My strategy is: push the padded versions of x and y into Fourier space, calculate their product, transform back, cut away the D last entries, and you're done!

I guess you mainly want to parallelise the calculation instead of improving the singleton calculation itself? But then, even non-FFT would incorporate 2D explicit Python sums for the Fourier transform, and again 2D sums for transforming back; is this an improvement over the method you are using currently? And you could code it in pure Python :-)

I will investigate this tonight. I'm curious. Irrespective of that I have also other fish to fry ... ;-)

Iirc, DFT is in fact used for efficient polynomial multiplication. Maybe Chuck or another core developer of numpy can tell whether numpy.polynomial actually does it that way?

Friedrich
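As a concrete illustration of the padding strategy sketched above, here is a minimal version using numpy.fft instead of a hand-written gdft module. The function name is invented and this is only a sketch of the idea, not Friedrich's actual code:

import numpy as np

def truncated_mul_fft(x, y):
    """Multiply two truncated Taylor polynomials via zero-padding and the FFT."""
    D = len(x)
    n = 2 * D    # pad so the circular convolution equals the linear one
    z = np.fft.ifft(np.fft.fft(x, n) * np.fft.fft(y, n))[:D]   # truncate higher orders
    return z.real    # imaginary parts are only rounding noise

print(truncated_mul_fft([1.0, 1.0, 0.0], [2.0, 3.0, 0.0]))     # approx. [2., 5., 3.]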

2010/2/27 Sebastian Walter <sebastian.walter@gmail.com>:
IMO this kind of discussion is not offtopic since it is directly related to the original question.
Ok, but it's not my responsibility if the numpy-discussion namespace gets polluted now.
2010/2/27 Sebastian Walter <sebastian.walter@gmail.com>:
On Sat, Feb 27, 2010 at 3:59 PM, Friedrich Romstedt <friedrichromstedt@gmail.com> wrote:
I'm currently working on upy, uncertainty-python, dealing with real numbers: github.com/friedrichromstedt/upy . I want to extend it in the mid-term to complex numbers, where the concepts of "uncertainty" are necessarily more elaborate. Do you think the concept of truncated Taylor polynomials could help in understanding or even generalising the uncertainties?

I'm not sure what you mean by uncertainties. Could you elaborate? For all I know you can use Taylor series for nonlinear error propagation.
I mean Gaussian error propagation. I am currently not intending to cover regimes where one has to consider "higher order" effects, unless it is very easy. Rather, I want to find a way to describe complex numbers which consist of a deterministic value and as many superposed "complex Gaussian variables" as needed.
what is a Gaussian variable? a formula would help ;)
Oh, I wasn't precise: en.wikipedia.org/wiki/Gaussian_random_variable (aka normal distribution)
Are complex numbers and truncated Taylor polynomials in some way isomorphic or something similar?
Taylor polynomials are not a field but just a commutative ring (there are zero divisors) so I guess it's not possible to find an isomorphism.
Ok.
The other things I will post back to the list, where they belong to. I just didn't want to have off-topic discussion there.
Friedrich

Ok, it took me about one hour, but here they are: Fourier-accelerated polynomials.
python
Python 2.4.1 (#65, Mar 30 2005, 09:13:57) [MSC v.1310 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import gdft_polynomial
>>> p1 = gdft_polynomial.Polynomial([1])
>>> p2 = gdft_polynomial.Polynomial([2])
>>> p1 * p2
<gdft_polynomial.polynomial.Polynomial instance at 0x00E78A08>
>>> print p1 * p2
[ 2.+0.j]
>>> p1 = gdft_polynomial.Polynomial([1, 1])
>>> p2 = gdft_polynomial.Polynomial([1])
>>> print p1 * p2
[ 1. +6.12303177e-17j  1. -6.12303177e-17j]
>>> p2 = gdft_polynomial.Polynomial([1, 2])
>>> print p1 * p2
[ 1. +8.51170986e-16j  3. +3.70074342e-17j  2. -4.44089210e-16j]
>>> p1 = gdft_polynomial.Polynomial([1, 2, 3, 4, 3, 2, 1])
>>> p2 = gdft_polynomial.Polynomial([4, 3, 2, 1, 2, 3, 4])
>>> print (p1 * p2).coefficients.real
[  4.  11.  20.  30.  34.  35.  36.  35.  34.  30.  20.  11.   4.]
github.com/friedrichromstedt/gdft_polynomials

It's open for bug hunting :-) Haven't checked the last result.

I used my own gdft module. Maybe one could incorporate numpy.fft easily. But that's your job, Sebastian, isn't it? Feel free to push to the repo, and don't forget to add your name to the copyright notice; hope you are happy with MIT.

Anyway, I don't know whether numpy.fft supports transforming only one coordinate and using the others for "parallelisation"?

Friedrich

On Sat, Feb 27, 2010 at 11:11 PM, Friedrich Romstedt <friedrichromstedt@gmail.com> wrote:
Ok, it took me about one hour, but here they are: Fourier-accelerated polynomials.
that's the spirit! ;)
python
Python 2.4.1 (#65, Mar 30 2005, 09:13:57) [MSC v.1310 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import gdft_polynomial
>>> p1 = gdft_polynomial.Polynomial([1])
>>> p2 = gdft_polynomial.Polynomial([2])
>>> p1 * p2
<gdft_polynomial.polynomial.Polynomial instance at 0x00E78A08>
>>> print p1 * p2
[ 2.+0.j]
>>> p1 = gdft_polynomial.Polynomial([1, 1])
>>> p2 = gdft_polynomial.Polynomial([1])
>>> print p1 * p2
[ 1. +6.12303177e-17j  1. -6.12303177e-17j]
>>> p2 = gdft_polynomial.Polynomial([1, 2])
>>> print p1 * p2
[ 1. +8.51170986e-16j  3. +3.70074342e-17j  2. -4.44089210e-16j]
>>> p1 = gdft_polynomial.Polynomial([1, 2, 3, 4, 3, 2, 1])
>>> p2 = gdft_polynomial.Polynomial([4, 3, 2, 1, 2, 3, 4])
>>> print (p1 * p2).coefficients.real
[  4.  11.  20.  30.  34.  35.  36.  35.  34.  30.  20.  11.   4.]
github.com/friedrichromstedt/gdft_polynomials
It's open for bug hunting :-)
Haven't checked the last result.
looks correct
I used my own gdft module. Maybe one could incorporate numpy.fft easily. But that's your job, Sebastian, isn't it? Feel free to push to the repo, and don't forget to add your name to the copyright notice, hope you are happy with MIT.
i'll have a look at it.
Anyway, I don't know whether numpy.fft supports transforming only one coordinate and using the others for "parallelisation"?
Friedrich

2010/2/27 Sebastian Walter <sebastian.walter@gmail.com>:
On Sat, Feb 27, 2010 at 11:11 PM, Friedrich Romstedt <friedrichromstedt@gmail.com> wrote:
Ok, it took me about one hour, but here they are: Fourier-accelerated polynomials.
that's the spirit! ;)
Yes! I like it! :-)
python
Python 2.4.1 (#65, Mar 30 2005, 09:13:57) [MSC v.1310 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import gdft_polynomial
>>> p1 = gdft_polynomial.Polynomial([1])
>>> p2 = gdft_polynomial.Polynomial([2])
>>> p1 * p2
<gdft_polynomial.polynomial.Polynomial instance at 0x00E78A08>
>>> print p1 * p2
[ 2.+0.j]
>>> p1 = gdft_polynomial.Polynomial([1, 1])
>>> p2 = gdft_polynomial.Polynomial([1])
>>> print p1 * p2
[ 1. +6.12303177e-17j  1. -6.12303177e-17j]
>>> p2 = gdft_polynomial.Polynomial([1, 2])
>>> print p1 * p2
[ 1. +8.51170986e-16j  3. +3.70074342e-17j  2. -4.44089210e-16j]
>>> p1 = gdft_polynomial.Polynomial([1, 2, 3, 4, 3, 2, 1])
>>> p2 = gdft_polynomial.Polynomial([4, 3, 2, 1, 2, 3, 4])
>>> print (p1 * p2).coefficients.real
[  4.  11.  20.  30.  34.  35.  36.  35.  34.  30.  20.  11.   4.]
github.com/friedrichromstedt/gdft_polynomials
It's open for bug hunting :-)
Haven't checked the last result.
looks correct
We should check, simply using numpy.polynomial
I used my own gdft module. Maybe one could incorporate numpy.fft easily. But that's your job, Sebastian, isn't it? Feel free to push to the repo, and don't forget to add your name to the copyright notice; hope you are happy with MIT.
i'll have a look at it.
I will be obliged.
Anyway, I don't know whether numpy.fft supports transforming only one coordinate and using the others for "parallelisation"?
I will check tomorrow. Suggestion: The other thread is the main thread, please reply there. (Gmane shows also the thread structure ...) If it's not related to this one ...

Sebastian, and, please, be not offended by what I wrote. I regret a bit my jokes ... It's simply too late at night. Friedrich

On Sun, Feb 28, 2010 at 12:47 AM, Friedrich Romstedt <friedrichromstedt@gmail.com> wrote:
Sebastian, and, please, be not offended by what I wrote. I regret a bit my jokes ... It's simply too late at night.
no offense taken
Friedrich

On Sat, Feb 27, 2010 at 10:02 PM, Friedrich Romstedt <friedrichromstedt@gmail.com> wrote:
To the core developers (of numpy.polynomial e.g.): Skip the mess and read the last paragraph.
The other things I will post back to the list, where they belong to. I just didn't want to have off-topic discussion there.
I wanted to stress that one can do arithmetic on Taylor polynomials in a very similar was as with complex numbers.
I do not understand completely. What is the analogy with complex numbers which one cannot draw to real numbers? Or, more precise: What /is/ actually the analogy besides that there are operations? With i, i * i = -1, thus one has not to discard terms, contrary to the polynomial product as you defined it, no?
I'm sorry this comment turns out to be confusing. It has apparently quite the contrary effect of what I wanted to achieve: Since there is already a polynomial module in numpy I wanted to highlight their difference and stress that they are used to do arithmetic, e.g. compute f([x],[y]) = [x] * (sin([x])**2 + [y]) in Taylor arithmetic.
I guess there are also situations when you have polynomials z = \sum_{d=0}^{D-1} z_d t^d, where z_d are complex numbers.
I like it more to implement operators as overloads of the __mul__
I thought about something like def __mul__(self, other): return mul(self,other)
Yes, I know! And in fact, it may symmetrise the thing a bit.
etc., but this is a matter of taste. In fact, you /have/ to provide external binary operators, because I guess you also want to have numpy.ndarrays as left operand. In that case, the undarray will have higher precedence, and will treat your data structure as a scalar, applying it to all the undarray's elements.
well, actually it should treat it as a scalar since the Taylor polynomial is something like a real or complex number.
Maybe I misunderstood you completely, but didn't you want to implement arrays of polynomials using numpy? So I guess you want to apply a vector from numpy pairwise to the polynomials in the P-object?
no, the vectorization is something different. Its purpose becomes clear only when applied in Algorithmic Differentiation. E.g. if you have a function

f: R^N -> R, x -> y = f(x), where x = [x1, ..., xN]

and you want to compute the gradient g(x) of f(x), then you can compute df(x)/dxn by propagating the following array of Taylor polynomials:

x = numpy.array([UTPS([x1_0, 0]), ..., UTPS([xn_0, 1]), ..., UTPS([xN_0, 0])], dtype=object)
y = f(x)

If you want the complete gradient, you will have to repeat this N times, each time with the same zeroth coefficients [x1, ..., xN]. Using the vectorized version, you would do only one propagation:

x = numpy.array([UTPS([x1_0, 1, 0, ..., 0]), ..., UTPS([xn_0, 0, ..., 1, ..., 0]), ..., UTPS([xN_0, 0, ..., 1])], dtype=object)
y = f(x)

i.e. you save the overhead of calling the same function N times.
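To make the seeding above concrete, here is a tiny first-order sketch. The class is made up (it is not the actual taylorpoly UTPS API): each object stores [x_0, dx_1, ..., dx_P], and a single propagation through f yields the value and the full gradient.

import numpy as np

class UTPS1:
    """Toy first-order UTPS: data = [x_0, dx_1, ..., dx_P]."""
    def __init__(self, data):
        self.data = np.asarray(data, dtype=float)
    def __add__(self, other):
        return UTPS1(self.data + other.data)
    def __mul__(self, other):
        out = self.data * other.data[0]            # [x_0*y_0, dx*y_0]
        out[1:] += self.data[0] * other.data[1:]   # + x_0*dy  (product rule)
        return UTPS1(out)

def f(x):
    # example function f(x1, x2) = x1*x2 + x1*x1
    return x[0] * x[1] + x[0] * x[0]

# seed both directions at the base point (3, 5) and propagate in one sweep
x = [UTPS1([3.0, 1.0, 0.0]), UTPS1([5.0, 0.0, 1.0])]
y = f(x)
print(y.data)   # [24. 11.  3.] -> value 24, gradient (11, 3)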
[z]_{D+1} =_{D+1} [x]_{D+1} y_{D+1}
\sum_{d=0}^{D} z_d t^d =_{D+1} \left( \sum_{d=0}^{D} x_d t^d \right) \left( \sum_{c=0}^{D} y_c t^c \right)
Did your forget the [] around the y in line 1, or is this intentional? Actually, I had to compile the things before I could imagine what you could mean.
Yes, I'm sorry, this is a typo.
Why don't you use multidimensional arrays? Has it reasons in the C accessibility? Now, as I see it, you implement your strides manually. With a multidimensional array, you could even create arrays of shape (10, 12) of D-polynomials by storing an ndarray of shape (10, 12, D) or (D, 10, 12).
the goal is to have several Taylor polynomials evaluated at the same base point, e.g.

[x_0, x_{11}, x_{21}, x_{31}]
[x_0, x_{12}, x_{22}, x_{32}]
[x_0, x_{13}, x_{23}, x_{33}]

i.e. P=3, D=3. One could use a (P,D) array. However, one would do unnecessary computations since x_0 is the same for all P polynomials. I.e. one implements the data structure as

[x]_{D,P} := [x_0, x_{1,1}, ..., x_{1,D-1}, x_{2,1}, ..., x_{P,D-1}].

This results in non-constant stride access.
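For illustration, a small sketch of how one could index the raveled layout described above. The indexing convention is assumed from the description; this is not taylorpoly's actual C code:

import numpy as np

D, P = 4, 3                        # D coefficients per polynomial, P polynomials
data = np.zeros(1 + P * (D - 1))   # [x_0, x_{1,1},...,x_{1,D-1}, ..., x_{P,D-1}]

def coeff(data, p, d):
    """Return the degree-d coefficient of polynomial p (p = 1..P, d = 0..D-1)."""
    if d == 0:
        return data[0]                             # x_0, shared by all P polynomials
    return data[1 + (p - 1) * (D - 1) + (d - 1)]   # coefficients grouped per polynomial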
Just because of curiosity: Why do you set X_{0, 1} = ... X_{0, P} ?
Furthermore, I believe there is some elegant way formulating the product for a single polynomial. Let me think a bit ...
For one entry of [z]_E, you want to sum up all pairs: x_{0} y{E} + ... + x{D} y{E - D} , (1) right? In a matrix, containing all mutual products x_{i} y{j}, this are diagonals. Rotating the matrix by 45° counterclockwise, they are sums in columns. Hmmm... how to rotate a matrix by 45°?
Another fresh look: (1) looks pretty much like the discrete convolution of x_{i} and y_{i} at argument E. Going into Fourier space, this becomes a product. x_i and y_i have finite support, because one sets x_{i} and y_{i} = 0 outside 0 <= i <= D. The support of (1) in the running index is at most [0, D]. The support in E is at most [0, 2 D]. Thus you don't make mistakes by using DFT, when you pad x and y by D zeros on the right. My strategy is: Push the padded versions of x and y into Fourier space, calculate their product, transform back, cut away the D last entries, and you're done!
I guess you mainly want to parallelise the calculation instead of improving the singleton calculation itself? So then, even non-FFT would incorporate 2 D explicit python sums for Fourier transform, and again 2 D sums for transforming back, is this an improvement over the method you are using currently? And you could code it in pure Python :-)
I will investigate this tonight. I'm curious. Irrespective of that I have also other fish to fry ... ;-)
iirc, in fact DFT is used for efficient polynomial multiplication. Maybe Chuck or another core developer of numpy can tell whether numpy.polynomial does it actually that way?
I believe the degree D is typically much too small (i.e. D <= 4) to justify the additional overhead of using FFT, though there may be cases when really high order polynomials are used.
Friedrich

2010/2/27 Sebastian Walter <sebastian.walter@gmail.com>:
I'm sorry this comment turns out to be confusing.
Maybe it's not important.
It has apparently quite the contrary effect of what I wanted to achieve: Since there is already a polynomial module in numpy I wanted to highlight their difference and stress that they are used to do arithmetic, e.g. compute
f([x],[y]) = [x] * (sin([x])**2 + [y])
in Taylor arithmetic.
That's cool! You didn't mention that. Now I find out step by step what your module (package?) is for. You are a mathematician? Many physicists complain that mathematicians cannot make their point ;-)

I think I can use that to make my upy accept arbitrary functions, but how do you apply sin() to a TTP?

One more question: You said, t is an "external parameter". I, and maybe not only me, interpreted this as a complicated name for "variable". So I assumed it will be a parameter to some method of the TTP. But it isn't? It's just the way to define the ring? You could define it the same in Fourier space, except that you have to make the array large enough from the beginning? Why not do that, and say your computation relies on the Fourier transform of the representation? Can this give insight why TTPs are a ring and why they have zero divisors?
In fact, you /have/ to provide external binary operators, because I guess you also want to have numpy.ndarrays as left operand. In that case, the undarray will have higher precedence, and will treat your data structure as a scalar, applying it to all the undarray's elements.
well, actually it should treat it as a scalar since the Taylor polynomial is something like a real or complex number.
Maybe I misunderstood you completely, but didn't you want to implement arrays of polynomials using numpy? So I guess you want to apply a vector from numpy pairwise to the polynomials in the P-object?
no, the vectorization is something different. It's purpose becomes only clear when applied in Algorithmic Differentiation.
Hey folks, here's a cool package, but the maintainer didn't tell us! ;-)
E.g. if you have a function f: R^N -> R x -> y=f(x) where x = [x1,...,xN]
and you want to compute the gradient g(x) of f(x), then you can compute df(x)/dxn by propagating the following array of Taylor polynomials:
x = numpy.array( UTPS([x1_0, 0]), ..., UTPS([xn_0, 1]), ..., UTPS([xN_0,0]), dtype=object) y = f(x)
So what is the result of applying f to some UTPS instance: is it a plain number, or a UTPS again? How do you calculate? Can one calculate the derivative of some function using your method at a certain point without knowledge of the analytical derivative? I guess that's the purpose.
if you want to have the complete gradient, you will have to repeat N times. Each time for the same zero'th coefficients [x1,...,xN].
Using the vecorized version, you would do only one propagation x = numpy.array( UTPS([x1_0, 1,0,...,0]), ..., UTPS([xn_0, 0,...,1,...0]), ..., UTPS([xN_0,0,....,1]), dtype=object) y = f(x)
i.e. you save the overhead of calling the same function N times.
Ok, I understand. Today it's too late, I will reason tomorrow about it.
Why don't you use multidimensional arrays? Has it reasons in the C accessibility? Now, as I see it, you implement your strides manually. With a multidimensional array, you could even create arrays of shape (10, 12) of D-polynomials by storing an ndarray of shape (10, 12, D) or (D, 10, 12).
the goal is to have several Taylor polynomials evaluated in the same base point, e.g. [x_0, x_{11}, x_{21}, x_{31}] [x_0, x_{12}, x_{22}, x_{32}] [x_0, x_{13}, x_{23}, x_{33}]
i.e. P=3, D=3 One could use an (P,D) array. However, one would do unnecessary computations since x_0 is the same for all P polynomials. I.e. one implements the data structure as [x]_{D,P} := [x_0, x_{1,1},...,x_{1,D-1},x_{2,1},...,x_{P,D-1}].
This results in a non-const stride access.
No? I think the D axis stride should be P with offset 1, and the P axis stride 1? Is there a specific reason to store it raveled? And not x_0 and an ndarray(shape = (P, D - 1))?
Furthermore, I believe there is some elegant way formulating the product for a single polynomial. Let me think a bit ...
For one entry of [z]_E, you want to sum up all pairs: x_{0} y{E} + ... + x{D} y{E - D} , (1) right? In a matrix, containing all mutual products x_{i} y{j}, this are diagonals. Rotating the matrix by 45° counterclockwise, they are sums in columns. Hmmm... how to rotate a matrix by 45°?
Another fresh look: (1) looks pretty much like the discrete convolution of x_{i} and y_{i} at argument E. Going into Fourier space, this becomes a product. x_i and y_i have finite support, because one sets x_{i} and y_{i} = 0 outside 0 <= i <= D. The support of (1) in the running index is at most [0, D]. The support in E is at most [0, 2 D]. Thus you don't make mistakes by using DFT, when you pad x and y by D zeros on the right. My strategy is: Push the padded versions of x and y into Fourier space, calculate their product, transform back, cut away the D last entries, and you're done!
I guess you mainly want to parallelise the calculation instead of improving the singleton calculation itself? So then, even non-FFT would incorporate 2 D explicit python sums for Fourier transform, and again 2 D sums for transforming back, is this an improvement over the method you are using currently? And you could code it in pure Python :-)
I will investigate this tonight. I'm curious. Irrespective of that I have also other fish to fry ... ;-)
iirc, in fact DFT is used for efficient polynomial multiplication. Maybe Chuck or another core developer of numpy can tell whether numpy.polynomial does it actually that way?
I believe the degree D is typically much to small (i.e. D <= 4) to justify the additional overhead of using FFT, though there may be cases when really high order polynomials are used.
I guess it's not overhead. The number of calculations should be in equilibrium at very low D, am I wrong? And you win by not having to compile a C library, only native text Python code.

E.g., your optimisation package is quite interesting for me, but I'm using Windows as my main system, so it will be painful to compile. And the code is more straightforward, more concise, easier to maintain and easier to understand, ... :-) I really do not want to diminish your programming skills, please do not misunderstand! I only mean the subject.

Friedrich

On Sun, Feb 28, 2010 at 12:30 AM, Friedrich Romstedt <friedrichromstedt@gmail.com> wrote:
2010/2/27 Sebastian Walter <sebastian.walter@gmail.com>:
I'm sorry this comment turns out to be confusing.
Maybe it's not important.
It has apparently quite the contrary effect of what I wanted to achieve: Since there is already a polynomial module in numpy I wanted to highlight their difference and stress that they are used to do arithmetic, e.g. compute
f([x],[y]) = [x] * (sin([x])**2 + [y])
in Taylor arithmetic.
That's cool! You didn't mention that. Now I step by step find out what your module (package?) is for. You are a mathematician? Many physicists complain that mathematicians cannot make their point ;-)
I studied physics but switched to applied maths.
I think I can use that to make my upy accept arbitrary functions, but how do you apply sin() to a TTP?
perform a truncated Taylor expansion of [y]_D = sin([x]_D), i.e.

y_d = \frac{1}{d!} \left. \frac{d^d}{dt^d} \sin\left( \sum_{k=0}^{D-1} x_k t^k \right) \right|_{t=0}

to obtain an explicit algorithm.
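For concreteness, here is the classical recurrence such an expansion yields for sin/cos, written in plain Python. It is a sketch of the standard AD recurrence (differentiate s' = c*x', c' = -s*x' and compare coefficients), not taylorpoly's actual C routine:

import math

def utp_sin_cos(x):
    """Propagate sin and cos through a truncated Taylor polynomial x = [x_0,...,x_{D-1}]."""
    D = len(x)
    s = [0.0] * D
    c = [0.0] * D
    s[0], c[0] = math.sin(x[0]), math.cos(x[0])
    for d in range(1, D):
        s[d] = sum(k * x[k] * c[d - k] for k in range(1, d + 1)) / d
        c[d] = -sum(k * x[k] * s[d - k] for k in range(1, d + 1)) / d
    return s, c

# Taylor coefficients of sin(x_0 + t) at x_0 = 0.5, truncated before degree 4
s, c = utp_sin_cos([0.5, 1.0, 0.0, 0.0])
print(s)   # [sin(0.5), cos(0.5), -sin(0.5)/2, -cos(0.5)/6]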
One more question: You said, t is an "external paramter". I, and maybe not only me, interpreted this as a complicated name for "variable". So I assumed it will be a parameter to some method of the TTP. But it isn't? It's just the way to define the ring? You could define it the same in Fourier space, except that you have to make the array large enough from the beginning? Why not doing that, and saying, your computation relies on the Fourier transform of the representation? Can this give insight why TTPs are a ring and why they have zero divisors?
it has zero divisors because, for instance, the product of the two polynomials t and t^{D-1}, truncated at degree D-1, is zero.
In fact, you /have/ to provide external binary operators, because I guess you also want to have numpy.ndarrays as left operand. In that case, the undarray will have higher precedence, and will treat your data structure as a scalar, applying it to all the undarray's elements.
well, actually it should treat it as a scalar since the Taylor polynomial is something like a real or complex number.
Maybe I misunderstood you completely, but didn't you want to implement arrays of polynomials using numpy? So I guess you want to apply a vector from numpy pairwise to the polynomials in the P-object?
no, the vectorization is something different. It's purpose becomes only clear when applied in Algorithmic Differentiation.
Hey folks, here's a cool package, but the maintainer didn't tell us! ;-)
well, thanks :)
E.g. if you have a function f: R^N -> R x -> y=f(x) where x = [x1,...,xN]
and you want to compute the gradient g(x) of f(x), then you can compute df(x)/dxn by propagating the following array of Taylor polynomials:
x = numpy.array( UTPS([x1_0, 0]), ..., UTPS([xn_0, 1]), ..., UTPS([xN_0,0]), dtype=object) y = f(x)
So what is the result of applying f to some UTPS instance, is it a plain number, is an UTPS again? How do you calculate?
Can one calculate the derivative of some function using your method at a certain point without knowledge of the analytical derivative? I guess that's the purpose.
Yes, that's the whole point: Obtaining (higher order) derivatives of functions at machine precision for which no symbolic representation is readily available. That includes computer codes with recursions (e.g. for loops) that are a no-go for symbolic differentiation. Supposedly (I've never done that) you can even differentiate Monte Carlo simulations in that way.
if you want to have the complete gradient, you will have to repeat N times. Each time for the same zero'th coefficients [x1,...,xN].
Using the vecorized version, you would do only one propagation x = numpy.array( UTPS([x1_0, 1,0,...,0]), ..., UTPS([xn_0, 0,...,1,...0]), ..., UTPS([xN_0,0,....,1]), dtype=object) y = f(x)
i.e. you save the overhead of calling the same function N times.
Ok, I understand. Today it's too late, I will reason tomorrow about it.
Why don't you use multidimensional arrays? Has it reasons in the C accessibility? Now, as I see it, you implement your strides manually. With a multidimensional array, you could even create arrays of shape (10, 12) of D-polynomials by storing an ndarray of shape (10, 12, D) or (D, 10, 12).
the goal is to have several Taylor polynomials evaluated in the same base point, e.g. [x_0, x_{11}, x_{21}, x_{31}] [x_0, x_{12}, x_{22}, x_{32}] [x_0, x_{13}, x_{23}, x_{33}]
i.e. P=3, D=3 One could use an (P,D) array. However, one would do unnecessary computations since x_0 is the same for all P polynomials. I.e. one implements the data structure as [x]_{D,P} := [x_0, x_{1,1},...,x_{1,D-1},x_{2,1},...,x_{P,D-1}].
This results in a non-const stride access.
No? I think, the D axis stride shold be P with offset 1 and the P axis stride 1? Is there a specific reason to store it raveled? And not x_0 and ndarray(shape = (P, D - 1))?
1) cosmetic reasons 2) easier to interface C to Python.
Furthermore, I believe there is some elegant way formulating the product for a single polynomial. Let me think a bit ...
For one entry of [z]_E, you want to sum up all pairs: x_{0} y{E} + ... + x{D} y{E - D} , (1) right? In a matrix, containing all mutual products x_{i} y{j}, this are diagonals. Rotating the matrix by 45° counterclockwise, they are sums in columns. Hmmm... how to rotate a matrix by 45°?
Another fresh look: (1) looks pretty much like the discrete convolution of x_{i} and y_{i} at argument E. Going into Fourier space, this becomes a product. x_i and y_i have finite support, because one sets x_{i} and y_{i} = 0 outside 0 <= i <= D. The support of (1) in the running index is at most [0, D]. The support in E is at most [0, 2 D]. Thus you don't make mistakes by using DFT, when you pad x and y by D zeros on the right. My strategy is: Push the padded versions of x and y into Fourier space, calculate their product, transform back, cut away the D last entries, and you're done!
I guess you mainly want to parallelise the calculation instead of improving the singleton calculation itself? So then, even non-FFT would incorporate 2 D explicit python sums for Fourier transform, and again 2 D sums for transforming back, is this an improvement over the method you are using currently? And you could code it in pure Python :-)
I will investigate this tonight. I'm curious. Irrespective of that I have also other fish to fry ... ;-)
iirc, in fact DFT is used for efficient polynomial multiplication. Maybe Chuck or another core developer of numpy can tell whether numpy.polynomial does it actually that way?
I believe the degree D is typically much to small (i.e. D <= 4) to justify the additional overhead of using FFT, though there may be cases when really high order polynomials are used.
I guess it's not overhead. The number of calculations should be in equilibrium at very low D, am I wrong? And you win to not have to compile a C library but only native text Python code.
Well, I'm really no expert on the DFT. But doesn't the DFT compute on the complex numbers? You'll have extra overhead (let's say a factor >= 2?). And as far as I can tell, you do the computations on padded arrays, which possibly introduces cache misses (maybe another factor 2?). Isn't the advantage of the DFT that you can use the FFT, which would reduce the runtime from O(D^2) to O(D log(D))? I'm pretty sure that only pays off for D larger than 10.
E.g., your optimisation package is quite interesting for me, but I'm using Windows as my main system, so it will be painful to compile. And the code is more straightforward, more concise, easier to maintain and easier to understand, ... :-) I really do not want to diminish your programming skills, please do not misunderstand! I only mean the subject.
The project uses scons which is available for windows as binaries. I haven't tried it myself but I'm confident that it's a one minute job on windows.
I have implemented some of the algorithms just as you explained in another package (http://github.com/b45ch1/algopy/blob/master/algopy/utp/utps.py). But I don't think the code looks easier to maintain than the C code and it's also slower.
Friedrich

2010/2/28 Sebastian Walter <sebastian.walter@gmail.com>:
I think I can use that to make my upy accept arbitrary functions, but how do you apply sin() to a TTP?
perform truncated Taylor expansion of [y]_D = sin([x]_D), i.e. y_d = d^d/dt^d sin( \sum_{k=0}^{D-1} x_k t^k) |_{t=0} to obtain an explicit algorithm.
One more question: You said, t is an "external paramter". I, and maybe not only me, interpreted this as a complicated name for "variable". So I assumed it will be a parameter to some method of the TTP. But it isn't? It's just the way to define the ring?
I guess you overlooked this question?
You could define it the same in Fourier space, except that you have to make the array large enough from the beginning? Why not do that, and say your computation relies on the Fourier transform of the representation? Can this give insight why TTPs are a ring and why they have zero divisors?
it has zero divisors because, for instance, the product of the two polynomials t and t^{D-1}, truncated at degree D-1, is zero.
Yes, but I wanted to have a look from the Fourier-space point of view, because there everything is just a multiplication, and one does not have to carry out the convolution in one's head. I have to give up here. In fact, I do not really understand why my approach also works with the DFT and not only analytically with the continuous FT. Consider the snippet:
>>> p1 = gdft_polynomials.Polynomial([1])
>>> p1.get_dft(3)
array([ 1.+0.j,  1.+0.j,  1.+0.j])
>>> p2 = gdft_polynomials.Polynomial([0, 1])
>>> p2.get_dft(3)
array([ 1.0+0.j       , -0.5+0.8660254j, -0.5-0.8660254j])
>>> p2.get_dft(4)
array([  1.00000000e+00 +0.00000000e+00j,   6.12303177e-17 +1.00000000e+00j,
        -1.00000000e+00 +1.22460635e-16j,  -1.83690953e-16 -1.00000000e+00j])
As one increases the number of padding zeros, one increases Fourier space resolution, without affecting result:
>>> p3 = gdft_polynomials.Polynomial([0, 1, 0, 0, 0])
>>> print p2 * p2
Polynomial(real part =
[  1.85037171e-16  -7.40148683e-17   1.00000000e+00]
imaginary part =
[  2.59052039e-16   7.40148683e-17   0.00000000e+00])
>>> print p2 * p3
Polynomial(real part =
[  1.66533454e-16   1.48029737e-16   1.00000000e+00  -7.40148683e-17
  -4.44089210e-16  -3.70074342e-17]
imaginary part =
[  9.25185854e-17   1.48029737e-16   2.96059473e-16   1.11022302e-16
  -3.70074342e-16  -1.44497045e-16])
It's a bit of a mystery to me. Of course, one can argue, well, the DFT is information maintaining, and thus one can "feel" that it should work, but this is only a gut feeling.
E.g. if you have a function f: R^N -> R x -> y=f(x) where x = [x1,...,xN]
and you want to compute the gradient g(x) of f(x), then you can compute df(x)/dxn by propagating the following array of Taylor polynomials:
x = numpy.array( UTPS([x1_0, 0]), ..., UTPS([xn_0, 1]), ..., UTPS([xN_0,0]), dtype=object) y = f(x)
So what is the result of applying f to some UTPS instance, is it a plain number, is an UTPS again? How do you calculate?
Can one calculate the derivative of some function using your method at a certain point without knowledge of the analytical derivative? I guess that's the purpose.
Yes, that's the whole point: Obtaining (higher order) derivatives of functions at machine precision for which no symbolic representation is readily available. That includes computer codes with recursions (e.g. for loops) that are a no-go for symbolic differentiation. Supposedly (I've never done that) you can even differentiate Monte Carlo simulations in that way.
http://en.wikipedia.org/wiki/Automatic_differentiation: "Both classical methods have problems with calculating higher derivatives, where the complexity and errors increase. Finally, both classical methods are slow at computing the partial derivatives of a function with respect to many inputs, as is needed for gradient-based optimization algorithms. Automatic differentiation solves all of these problems." Yeah.

Note at this point that there is no chance for your project to be integrated in scipy, because you maybe HAVE TO PUBLISH UNDER GPL/CPAL (the ADOL-C is licensed GPL or CPAL). I cannot find CPL on www.opensource.org, but I guess it has been renamed to CPAL? Anyway, CPAL looks long enough to be GPL style ;-). I also published my projects under GPL first, and switched now to MIT, because Python, numpy, scipy, matplotlib, ... are published under BSD kind too, and in fact I like MIT/BSD more. Please check that you aren't violating GPL style licenses by publishing under BSD style.
E.g. if you have a function f: R^N -> R x -> y=f(x) where x = [x1,...,xN]
and you want to compute the gradient g(x) of f(x), then you can compute df(x)/dxn by propagating the following array of Taylor polynomials:
x = numpy.array( UTPS([x1_0, 0]), ..., UTPS([xn_0, 1]), ..., UTPS([xN_0,0]), dtype=object) y = f(x)
But doesn't the call f(x) with x.shape = (N,) result in an array too? But you want a scalar number?
if you want to have the complete gradient, you will have to repeat N times. Each time for the same zero'th coefficients [x1,...,xN].
Using the vecorized version, you would do only one propagation x = numpy.array( UTPS([x1_0, 1,0,...,0]), ..., UTPS([xn_0, 0,...,1,...0]), ..., UTPS([xN_0,0,....,1]), dtype=object) y = f(x)
i.e. you save the overhead of calling the same function N times.
Ok, I understand. Today it's too late, I will reason tomorrow about it.
I think I grasped the idea. But the thing is really tricky. I thought: So UTPS is not the thing you implemented, but rather you implemented the complete array. Right? But maybe that's wrong: UTPS([x1_0, 1, 0, ..., 0]) is with D = 1 and P = N (f: R^N -> R). I.e., P = N polynomials of degree 1, for calculating the first-order derivative? That's why your question (1) from Feb 27: What to hand over?

I would say, make it possible to hand over a (P, N) ndarray. It will increase the impact of your module (and graspableness) dramatically. And you have an indication how to interpret the array handed over without additional init args to UTPS(). I think the nominal value is always stored in x_{i, 0}, am I wrong?

I'm not sure what to use as initial UTPSs. Can you make it possible that one doesn't have to think about that? What would be great: if I have a target function f(a, b, c, ...) stored, to hand over objects from your package instead of ordinary numbers, and everything works out such that I end up with a result object where both the nominal value and the gradient are stored. Is that feasible? You could also monkey-patch numpy.sin etc. by a replacement calling the original numpy.sin with the nominal values but also doing the TTP job.
I guess it's not overhead. The number of calculations should be in equilibrium at very low D, am I wrong? And you win to not have to compile a C library but only native text Python code.
Well, I'm really no expert on the DFT. But doesn't the DFT compute on the complex numbers? you'll have extra overhead (let's say factor >= 2?) And as far as I can tell, you do the computations on padded arrays which possibly introduces cache misses (maybe another factor 2?)
What are "cache misses"?
Isn't the advantage of the DFT not that you can use the FFT which would reduce the runtime from O(D^2) to O(D log(D))? I'm pretty sure that only pays off for D larger than 10.
Your algorithm stays at O(D^2) as you do the convolution by hand, no?
E.g., your optimisation package is quite interesting for me, but I'm using Windows as my main system, so it will be painful to compile. And the code is more straightforward, more concise, easier to maintain and easier to understand, ... :-) I really do not want to diminish your programming skills, please do not misunderstand! I only mean the subject.
The project uses scons which is available for windows as binaries. I haven't tried it myself but I'm confident that it's a one minute job on windows.
The optimisation package or utp? I want to give utp a try.
I have implemented some of the algorithms just as you explained in another package (http://github.com/b45ch1/algopy/blob/master/algopy/utp/utps.py). But I don't think the code looks easier to maintain than the C code and it's also slower.
Can you explain which repo I should best clone? What are the differences between algopy, taylorpoly and pyadolc? I'm slowly getting a bit confused with all the mails in this thread, and also the subject isn't that easy ... http://en.wikipedia.org/wiki/Automatic_differentiation also refers to TTP only in a marginal note :-( Friedrich

On Sun, Feb 28, 2010 at 9:06 PM, Friedrich Romstedt <friedrichromstedt@gmail.com> wrote:
2010/2/28 Sebastian Walter <sebastian.walter@gmail.com>:
I think I can use that to make my upy accept arbitrary functions, but how do you apply sin() to a TTP?
perform truncated Taylor expansion of [y]_D = sin([x]_D), i.e. y_d = d^d/dt^d sin( \sum_{k=0}^{D-1} x_k t^k) |_{t=0} to obtain an explicit algorithm.
One more question: You said, t is an "external paramter". I, and maybe not only me, interpreted this as a complicated name for "variable". So I assumed it will be a parameter to some method of the TTP. But it isn't? It's just the way to define the ring?
I guess you overlooked this question?
I thought this was a rhetorical question. Tbh I don't know what the standard name for such a "formal variable" is.
You could define it the same in Fourier space, except that you have to make the array large enough from the beginning? Why not do that, and say your computation relies on the Fourier transform of the representation? Can this give insight why TTPs are a ring and why they have zero divisors?
it has zero divisors because, for instance, the product of the two polynomials t and t^{D-1}, truncated at degree D-1, is zero.
Yes, but I wanted to have a look from Fourier space of view. Because there everything is just a multiplication, and one does not have to perform the convolution in mind. I have to give up here. In fact, I do not really understand why my approach also works with DFT and not only analytically with steady FT. Consider the snippet:
>>> p1 = gdft_polynomials.Polynomial([1])
>>> p1.get_dft(3)
array([ 1.+0.j,  1.+0.j,  1.+0.j])
>>> p2 = gdft_polynomials.Polynomial([0, 1])
>>> p2.get_dft(3)
array([ 1.0+0.j       , -0.5+0.8660254j, -0.5-0.8660254j])
>>> p2.get_dft(4)
array([  1.00000000e+00 +0.00000000e+00j,   6.12303177e-17 +1.00000000e+00j,
        -1.00000000e+00 +1.22460635e-16j,  -1.83690953e-16 -1.00000000e+00j])
As one increases the number of padding zeros, one increases Fourier space resolution, without affecting result:
>>> p3 = gdft_polynomials.Polynomial([0, 1, 0, 0, 0])
>>> print p2 * p2
Polynomial(real part =
[  1.85037171e-16  -7.40148683e-17   1.00000000e+00]
imaginary part =
[  2.59052039e-16   7.40148683e-17   0.00000000e+00])
>>> print p2 * p3
Polynomial(real part =
[  1.66533454e-16   1.48029737e-16   1.00000000e+00  -7.40148683e-17
  -4.44089210e-16  -3.70074342e-17]
imaginary part =
[  9.25185854e-17   1.48029737e-16   2.96059473e-16   1.11022302e-16
  -3.70074342e-16  -1.44497045e-16])
It's a bit of mystery to me. Of course, one can argue, well, DFT is information maintaining, and thus one can "feel" that it should work, but this is only a gut feeling.
I'm of no help here, I'm not familiar enough with the DFT. All I know is that F( conv(x,y)) = F(x) * F(y) and that one can speed up the convolution in that way. And most operations on truncated Taylor polynomials result in algorithms that contain convolutions.
E.g. if you have a function f: R^N -> R x -> y=f(x) where x = [x1,...,xN]
and you want to compute the gradient g(x) of f(x), then you can compute df(x)/dxn by propagating the following array of Taylor polynomials:
x = numpy.array( UTPS([x1_0, 0]), ..., UTPS([xn_0, 1]), ..., UTPS([xN_0,0]), dtype=object) y = f(x)
So what is the result of applying f to some UTPS instance, is it a plain number, is an UTPS again? How do you calculate?
Can one calculate the derivative of some function using your method at a certain point without knowledge of the analytical derivative? I guess that's the purpose.
Yes, that's the whole point: Obtaining (higher order) derivatives of functions at machine precision for which no symbolic representation is readily available. That includes computer codes with recursions (e.g. for loops) that are a no-go for symbolic differentiation. Supposedly (I've never done that) you can even differentiate Monte Carlo simulations in that way.
http://en.wikipedia.org/wiki/Automatic_differentiation: "Both classical methods have problems with calculating higher derivatives, where the complexity and errors increase. Finally, both classical methods are slow at computing the partial derivatives of a function with respect to many inputs, as is needed for gradient-based optimization algorithms. Automatic differentiation solves all of these problems."
Yeah.
Note at this point, that there is no chance for your project to be integrated in scipy, because you maybe HAVE TO PUBLISH UNDER GPL/CPAL (the ADOL-C is licensed GPL or CPAL). I cannot find CPL on www.opensource.org, but I guess it has been renamed to CPAL? Anyway, CPAL looks long enough to be GPL style ;-). I also published my projects under GPL first, and switched now to MIT, because Python, numpy, scipy, matplotlib, ... are published under BSD kind too, and in fact I like MIT/BSD more. Please check if your aren't violating GPL style licenses with publishing under BSD style.
AFAIK the CPL is basically the Eclipse Public License ( http://en.wikipedia.org/wiki/Eclipse_Public_License) which explicitly allows software under another licence to link to CPL code. The Python wrapper of ADOL-C is only linking to ADOL-C thus it can be BSD licensed. How is this related to the Taylor polynomials?
E.g. if you have a function f: R^N -> R x -> y=f(x) where x = [x1,...,xN]
and you want to compute the gradient g(x) of f(x), then you can compute df(x)/dxn by propagating the following array of Taylor polynomials:
x = numpy.array( UTPS([x1_0, 0]), ..., UTPS([xn_0, 1]), ..., UTPS([xN_0,0]), dtype=object) y = f(x)
But doesn't the call f(x) with x.shape = (N,) result in an array too? But you want a scalar number?
Since x[0] is a scalar and not a 0-dimensional ndarray I conjecture that a function mapping to the real numbers should be implemented to return a scalar, not an array.
if you want to have the complete gradient, you will have to repeat N times. Each time for the same zero'th coefficients [x1,...,xN].
Using the vecorized version, you would do only one propagation x = numpy.array( UTPS([x1_0, 1,0,...,0]), ..., UTPS([xn_0, 0,...,1,...0]), ..., UTPS([xN_0,0,....,1]), dtype=object) y = f(x)
i.e. you save the overhead of calling the same function N times.
Ok, I understand. Today it's too late, I will reason tomorrow about it.
I think I grasped the idea. But the thing is really tricky.
I thought: So UTPS is not the thing you implemented, but you implemented rather the complete array. Right?
I don't get your question.
But it's maybe wrong:
UTPS([x1_0, 1, 0, ..., 0]) is with D = 1 and P = N (f: R^N -> R). I.e., P = N polynomials of degree 1, for calculating the first-order derivative? That's why your question (1) from Feb 27: What to hand over? I would say, make it possible to hand over an (P, N) ndarray. It will increase impact of your module (and graspableness) dramatically. And you have an indication how to interpret the array handed over without additional init args to UTPS().
Well, there is a problem with the (P,D) approach. What if there is a comparison operator, e.g. [x]_D > 0, in your code? It is defined as x_0 > 0. Now, if a single UTPS contained P zeroth coefficients x_{1,0}, ..., x_{P,0}, then at this point the computer program would have to branch. Hence there must be only one x_0 in a UTPS instance. If it weren't for this branching, I'd have implemented it as a (P,D) array.

I thought about providing the possibility for the user to use a (P,D) array as input. However, this would mean hiding crucial information about the inner workings of the algorithms from the users. This is almost always a very bad idea. Also, it would violate the "only one way to do it" principle of Python. As Einstein said: Make things as simple as possible, but not any simpler ;) But it's good that you point that out. I agree that it would be nice to be more elegant.
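A toy illustration of the branching argument (hypothetical code, not the taylorpoly API): a comparison on a UTPS must resolve to a single bool, which only works if there is exactly one zeroth coefficient.

class ToyUTPS:
    def __init__(self, data):
        self.data = data                         # [x_0, x_{1,1}, ..., x_{P,D-1}]
    def __gt__(self, other):
        return self.data[0] > float(other)       # [x]_D > 0 is decided by x_0 alone

x = ToyUTPS([2.0, 1.0, 0.0])                     # shared base point x_0 = 2.0, two directions
if x > 0:                                        # resolves to x_0 > 0, a single bool
    print("positive branch taken")               # with P distinct base points the directions
                                                 # could disagree here and force P branches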
I think the nominal value is always stored in x_{i, 0}, am I wrong?
there is only one zero'th coefficient x_0 for all directions P.
I'm not shure what to use as initial UTPSs. Can you make it possible that one doesn't have to think about that? What would be great, if I have a target function f(a, b, c, ...) stored, to hand over instead of ordinary numbers objects from your package, and everything works out such that I end up in the result with an object where both the nominal value is stored as also the gradient. Is that feasible? You could also monkey-patch numpy.sin etc. by a replacement calling the original numpy.sin with the nominal values but also doing the ttp job.
The taylorpoly module is supposed to be a basic building block for AD tools and other software that relies on local polynomial approximations (e.g. Taylor series integrators for ODEs/DAEs). There are quite a lot of AD tools available in Python. I use pyadolc on a regular basis and it has turned out to be very reliable so far. It is also quite fast (probably a factor of 100 faster than pure Python based AD tools). numpy.sin(x) is smart enough to call x.sin() if x is an object.
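A quick check of that dispatch behaviour (the class here is made up; the point is just that a ufunc applied to an object-dtype array calls each element's same-named method):

import numpy as np

class Angle:
    def __init__(self, value):
        self.value = value
    def sin(self):                   # numpy.sin calls this for object elements
        return np.sin(self.value)

a = np.array([Angle(0.0), Angle(np.pi / 2)], dtype=object)
print(np.sin(a))                     # [0.0 1.0], computed via Angle.sin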
I guess it's not overhead. The number of calculations should be in equilibrium at very low D, am I wrong? And you win to not have to compile a C library but only native text Python code.
Well, I'm really no expert on the DFT. But doesn't the DFT compute on the complex numbers? you'll have extra overhead (let's say factor >= 2?) And as far as I can tell, you do the computations on padded arrays which possibly introduces cache misses (maybe another factor 2?)
What are "cache misses"?
now that I think of it, the size of the polynomials is much too small not to fit into the cache... A cache miss occurs when a memory block required for the next operation on the CPU is not in the cache and has to be transferred from main memory. Fetching from main memory is a process with high latency and relatively low bandwidth, i.e. the CPU is waiting, i.e. your algorithm operates significantly below peak performance. But anyway, forget about it. In this case it should be irrelevant.
Isn't the advantage of the DFT not that you can use the FFT which would reduce the runtime from O(D^2) to O(D log(D))? I'm pretty sure that only pays off for D larger than 10.
Your algorithm stays at O(D^2) as you do the convolution by hand, no?
Yes, its O(D^2).
E.g., your optimisation package is quite interesting for me, but I'm using Windows as my main system, so it will be painful to compile. And the code is more straightforward, more concise, easier to maintain and easier to understand, ... :-) I really do not want to diminish your programming skills, please do not misunderstand! I only mean the subject.
The project uses scons which is available for windows as binaries. I haven't tried it myself but I'm confident that it's a one minute job on windows.
The optimisation package or utp? I want to give utp a try.
what do you refer to by "optimization package"?
I have implemented some of the algorithms just as you explained in another package (http://github.com/b45ch1/algopy/blob/master/algopy/utp/utps.py). But I don't think the code looks easier to maintain than the C code and it's also slower.
Can you explain which repo I should clone at best? What are the differences between algopy, taylorpoly and pyadolc?
pyadolc is the wrapper of the very mature C++ software ADOL-C. It's quite fast and reliable.

algopy is a research prototype with a focus on higher-order differentiation of linear algebra functions (dot, inv, qr, eigh, trace, det, ...). This is done by propagating univariate Taylor polynomials with numpy.arrays as coefficients.

taylorpoly is supposed to provide building blocks. E.g. I plan to use these algorithms also in algopy. I hope that the software I write is deemed valuable enough to be included into scipy or preferably numpy.
I'm getting a bit confused slowly with all the mails in this thread, and also the subject isn't that easy ... http://en.wikipedia.org/wiki/Automatic_differentiation also refers to TTP only in a marginal note :-(
One could/should improve the wikipedia article, I guess.
Friedrich