markflorisson88 at gmail.com
Tue Oct 18 23:34:54 CEST 2011
I'm copy/pasting this message to the ML with regard to the previous
discussion on cython-users about auto-vectorization (apparently my
forwarded mail got rejected).
Perhaps an approach like the one below would be easier than generating
Fortran (and dealing with the pain of linking with it, distutils
compatibility, forcing the user to install a Fortran compiler, etc.).
------------ Forwarded Message Below ------------
With regard to the discussion on the Cython mailing list regarding
SSE and vectorizing, I have an unfinished project which might be of
interest. The project wraps the Orc compiler (
which is a simplified assembly language for creating cross-platform
tight-loop code utilizing SIMD architectures.
With some simple test code for a sin function approximation I get a
speedup of 10x over the corresponding NumPy functions (single threaded).
By utilizing OpenMP it is possible to extend this to multiple threads and
gain further speedups.
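
For readers unfamiliar with the idea, a sin approximation kernel of this
kind might look roughly like the C sketch below. This is not the Orc code
from the project: the function name sin_approx, the degree-7 Taylor
polynomial, and the restriction to inputs in [-pi/2, pi/2] are all
assumptions chosen for illustration. The point is only that the loop body
is branch-free arithmetic, which is the shape of code that maps well onto
SIMD instructions and that OpenMP can split across threads.

```c
#include <stddef.h>

/* Hypothetical kernel (not taken from the project): approximate sin(x)
 * for x in [-pi/2, pi/2] with a degree-7 Taylor polynomial,
 *   sin(x) ~ x - x^3/3! + x^5/5! - x^7/7!
 * evaluated in Horner form. The truncation error on that range is
 * below 2e-4. */
void sin_approx(const float *x, float *out, size_t n)
{
    /* The pragma is only emitted when the compiler has OpenMP enabled
     * (e.g. -fopenmp); otherwise this is a plain serial loop. */
#ifdef _OPENMP
#pragma omp parallel for
#endif
    for (size_t i = 0; i < n; i++) {
        float v = x[i];
        float v2 = v * v;
        /* Branch-free body: only multiplies and adds. */
        out[i] = v * (1.0f
               + v2 * (-1.0f / 6.0f
               + v2 * ( 1.0f / 120.0f
               + v2 * (-1.0f / 5040.0f))));
    }
}
```

A real implementation would also need range reduction for inputs outside
[-pi/2, pi/2], and would probably use fitted (minimax) coefficients
rather than plain Taylor terms.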
The code is currently just a proof of concept; feel free to adopt and
extend it if wanted.