Hello, My name is Brian Newsom and I am just beginning to use/work on the scipy library, specifically the quadpack within the integration pack. Because of the recursion and wrapping methods used now, running nquad (scipy 0.13) function is slow-the fortran and python are nested in a way that requires switching back and forth between the two often. Ideally this inefficiency could be overcome by translating the main interface (quadpack.py) into fortran and just having a minimal python wrapper that is not used in the recursion. First off, has anyone attempted (or is currently attempting) this? Are there any major difficulties I should be wary of or is this unrealistic for any reason? Additionally, I was curious if there are any computational advantages to using C or Fortran for this purpose within this library, either is a possibility in the case that one is strongly preferred. Otherwise, I am sure I will be full of questions in the near future with specific regards to the building and setup of these files within the scipy library. Thank you, Brian Newsom Brian.Newsom@colorado.edu
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 08.09.2013 07:43, Brian Lee Newsom kirjoitti:
My name is Brian Newsom and I am just beginning to use/work on the scipy library, specifically the quadpack within the integration pack. Because of the recursion and wrapping methods used now, running nquad (scipy 0.13) function is slow-the fortran and python are nested in a way that requires switching back and forth between the two often. Ideally this inefficiency could be overcome by translating the main interface (quadpack.py) into fortran and just having a minimal python wrapper that is not used in the recursion.
The first question: did you measure the overhead involved in the Python code, and determined that it dominates the computation time? If not, that would be the first thing to do, otherwise it's possible to end up with a wild goose chase. On the face of it, it does not sound likely to me that much will be gained by writing the recursion in a lower-level language. Suppose that each level requires N >> 1 evaluations of the function. The innermost function will be called in total N^3 times, the next level N^2 times and the outer level N times. The inner level involves N^3 calls back to Python, which are not removed by writing the recursion in Fortran. Since N^3 >> N^2, the innermost callback overhead dominates the run time. Of course, the N^2 term has a bigger prefactor than N^3, so this argument is not fully convincing, but it would be best to measure first. IIRC, the `quad` integration routine can directly call ctypes function pointers, skipping Python overhead. (See the Python ctypes module documentation for details on how to define them.) This requires Scipy
= 0.11.0, however.
There have been plans to improve the interoperability further (with Cython, f2py, and other ways) to interact directly with low-level code in Scipy's computational routines. Putting effort into this direction and trying to speed up the innermost part would seem to me the most sensible direction of further work. Of course, the above is written assuming your benchmarks don't show me wrong. I don't believe anyone is currently working on `nquad`, so go ahead. However, note that due to the current situation with Mingw Gfortran compiler on Windows, there may be issues in accepting Fortran 90 code. Best regards, - -- Pauli Virtanen -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) iEYEARECAAYFAlIs1a8ACgkQ6BQxb7O0pWBYtwCdHrXw+wXD4UMf9CLzHHe3hkpM V6QAniW2X6HeC4YWDHnDJwEZ5suxnXvi =3ptO -----END PGP SIGNATURE-----
I took a look at the quad routine, and it appears that its ctypes integration is limited to single-variable functions. I'm not totally sure why that would be, but it's a bit crippling for anything that includes extra args, as well as any attempt at multidimensional integrals. Can anyone confirm this? I'm pretty new at the guts of SciPy and C in general. Nathan Woods On Sun, Sep 8, 2013 at 1:53 PM, Pauli Virtanen <pav@iki.fi> wrote:
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
08.09.2013 07:43, Brian Lee Newsom kirjoitti:
My name is Brian Newsom and I am just beginning to use/work on the scipy library, specifically the quadpack within the integration pack. Because of the recursion and wrapping methods used now, running nquad (scipy 0.13) function is slow-the fortran and python are nested in a way that requires switching back and forth between the two often. Ideally this inefficiency could be overcome by translating the main interface (quadpack.py) into fortran and just having a minimal python wrapper that is not used in the recursion.
The first question: did you measure the overhead involved in the Python code, and determined that it dominates the computation time? If not, that would be the first thing to do, otherwise it's possible to end up with a wild goose chase.
On the face of it, it does not sound likely to me that much will be gained by writing the recursion in a lower-level language. Suppose that each level requires N >> 1 evaluations of the function. The innermost function will be called in total N^3 times, the next level N^2 times and the outer level N times. The inner level involves N^3 calls back to Python, which are not removed by writing the recursion in Fortran. Since N^3 >> N^2, the innermost callback overhead dominates the run time. Of course, the N^2 term has a bigger prefactor than N^3, so this argument is not fully convincing, but it would be best to measure first.
IIRC, the `quad` integration routine can directly call ctypes function pointers, skipping Python overhead. (See the Python ctypes module documentation for details on how to define them.) This requires Scipy
= 0.11.0, however.
There have been plans to improve the interoperability further (with Cython, f2py, and other ways) to interact directly with low-level code in Scipy's computational routines. Putting effort into this direction and trying to speed up the innermost part would seem to me the most sensible direction of further work.
Of course, the above is written assuming your benchmarks don't show me wrong.
I don't believe anyone is currently working on `nquad`, so go ahead. However, note that due to the current situation with Mingw Gfortran compiler on Windows, there may be issues in accepting Fortran 90 code.
Best regards,
- -- Pauli Virtanen -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux)
iEYEARECAAYFAlIs1a8ACgkQ6BQxb7O0pWBYtwCdHrXw+wXD4UMf9CLzHHe3hkpM V6QAniW2X6HeC4YWDHnDJwEZ5suxnXvi =3ptO -----END PGP SIGNATURE-----
_______________________________________________ SciPy-Dev mailing list SciPy-Dev@scipy.org http://mail.scipy.org/mailman/listinfo/scipy-dev
09.10.2013 20:33, Nathan Woods kirjoitti:
I took a look at the quad routine, and it appears that its ctypes integration is limited to single-variable functions. I'm not totally sure why that would be, but it's a bit crippling for anything that includes extra args, as well as any attempt at multidimensional integrals. Can anyone confirm this? I'm pretty new at the guts of SciPy and C in general.
See http://article.gmane.org/gmane.comp.python.scientific.devel/18285
I gather that there's not an established template for how to do this sort of thing? For instance, is there some other part of SciPy where someone has integrated Cython (say) with low-level libraries such that callbacks to python are avoided? Nathan Woods On Wed, Oct 9, 2013 at 12:23 PM, Pauli Virtanen <pav@iki.fi> wrote:
09.10.2013 20:33, Nathan Woods kirjoitti:
I took a look at the quad routine, and it appears that its ctypes integration is limited to single-variable functions. I'm not totally sure why that would be, but it's a bit crippling for anything that includes extra args, as well as any attempt at multidimensional integrals. Can anyone confirm this? I'm pretty new at the guts of SciPy and C in general.
See http://article.gmane.org/gmane.comp.python.scientific.devel/18285
_______________________________________________ SciPy-Dev mailing list SciPy-Dev@scipy.org http://mail.scipy.org/mailman/listinfo/scipy-dev
09.10.2013 21:35, Nathan Woods kirjoitti:
I gather that there's not an established template for how to do this sort of thing?
For instance, is there some other part of SciPy where someone has integrated Cython (say) with low-level libraries such that callbacks to python are avoided?
No, this thing was added as a quick hack. As I wrote in my other mail, this should be generalized to a form where the same mechanism can be reused also in different places. It might also be desirable to not tie it in with Ctypes at all; [*] we might just want to have a scipy.CFunction which would provide the restricted subset of arbitrary parameter passing we want to support, and in a form that it easy to deal with in C. As you can see from the `quad` C code, Ctypes is a bit painful here. [*] Of course, conversion from Ctypes function pointers could be provided. -- Pauli Virtanen
participants (3)
-
Brian Lee Newsom -
Nathan Woods -
Pauli Virtanen