On 09.03.2015 at 21:05, Nikolay Mayorov <n59_ru@hotmail.com> wrote:
Gregor, many thanks for your input. I noted a few useful points:
1) The option to solve the normal equations directly is indeed useful when m >> n.
2) I read the tech report and the approach looks really good, considering how often the approximation is sought as a linear combination of basis functions (which in turn depend on tunable parameters).
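A minimal sketch of point 1 (my own illustration, not from the thread), assuming a tall hypothetical Jacobian J and residual vector r: when m >> n, the small n x n normal equations J^T J x = J^T r are cheap to form and solve directly, at the price of squaring the condition number compared to a QR-based least-squares solver.

    import numpy as np

    rng = np.random.default_rng(0)
    m, n = 100_000, 10                   # m >> n
    J = rng.standard_normal((m, n))      # hypothetical Jacobian
    r = rng.standard_normal(m)           # hypothetical residual vector

    # Direct solve of the n x n normal equations J^T J x = J^T r
    x_normal = np.linalg.solve(J.T @ J, J.T @ r)

    # Reference: QR-based least-squares solve of the full m x n system
    x_lstsq, *_ = np.linalg.lstsq(J, r, rcond=None)

    print(np.allclose(x_normal, x_lstsq))  # True for well-conditioned J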
I didn't understand the reason to store the Jacobian in transposed form. What significant difference will it make?
This is only a minor optimization that improves the memory access pattern. I forget the details, but the current scipy leastsq also offers the option to switch to transposed storage to improve performance (see the col_deriv argument, off by default). Linear algebra routines (qr, dot) are faster with Fortran-contiguous arrays, or with C-contiguous arrays in transposed storage.

A simple example showing that the memory access pattern can make a big difference:

    In [30]: a = arange(10000000)
    In [31]: b = a[::5]
    In [32]: c = a[:2000000]
    In [33]: %timeit dot(b, b)
    100 loops, best of 3: 3.81 ms per loop
    In [34]: %timeit dot(c, c)
    1000 loops, best of 3: 1.09 ms per loop

Both b and c hold 2,000,000 elements, but b is a strided view while c is contiguous, so dot(c, c) runs roughly 3.5x faster. I also found that the code usually gets simpler. All this may be irrelevant for small problems or for functions that are expensive to evaluate.

Gregor
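A small timing sketch of my own (assuming NumPy with a standard BLAS/LAPACK; the actual numbers depend on the build and the machine), comparing a QR factorization of the same data stored C-contiguous versus Fortran-contiguous, the layout LAPACK works in natively:

    import numpy as np
    from timeit import timeit

    rng = np.random.default_rng(0)
    a_c = rng.standard_normal((2000, 2000))  # C-contiguous (row-major)
    a_f = np.asfortranarray(a_c)             # same values, Fortran order

    # np.linalg.qr may have to copy the C-ordered input into Fortran
    # order before calling LAPACK, so the Fortran-ordered array can
    # come out ahead; measure on your machine to see.
    print("C order:      ", timeit(lambda: np.linalg.qr(a_c), number=5))
    print("Fortran order:", timeit(lambda: np.linalg.qr(a_f), number=5))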