[Numpy-discussion] Disabling Extended Precision in NumPy (like -ffloat-store)

Sat Apr 24 01:07:32 EDT 2010

Adrien Guillon wrote:
> Thank you for your questions... I'll answer them now.
>
> The motivation behind using Python and NumPy is to be able to "double
> check" that the numerical algorithms work okay in an
> engineer/scientist friendly language.  We're basically prototyping a
> bunch of algorithms in Python, validating that they work right... so
> that when we move to the GPU we (hopefully) know that the algorithms
> are okay sequentially... any problems we come against therefore would
> have to be something else with the implementation (i.e. we don't want
> to be debugging too much at once).  We are only targeting GPUs that
> support IEEE floating point... in theory the results should be similar
> on the GPU and CPU under certain assumptions (that the floating point
> is limited to single precision throughout... which Intel processors
> are a bit funny about).
>
> The main concern I have, and thus my motivation for wanting a
> restricted mode, is that parts of the algorithm may not work properly
> if done in extended precision.  We are trying to ensure that
> theoretically there should be no problems, but realistically it's nice
> to have an extra layer of protection where we say "oh, wait, that's
> not right" when we look at the results.
>
> The idea here is that if I can ensure there is never extended
> precision in the Python code... it should closely emulate the GPU
> which in fact has no extended precision in the hardware.  Ordinarily,
> it's probably a silly thing to want to do (who would complain about
> extended precision for free?).  But in this case I think it's the right
> decision.
>
> Hope this clarifies the reasons and intentions.

If you are mainly targeting NVIDIA GPUs, you should check out Theano,
which lets you prototype algorithms and have Theano generate and run
CUDA GPU code for you.  Translating a NumPy-using algorithm into
Theano calls is straightforward and fairly quick, and if you're running
your prototypes on a GPU in the first place, you can be sure they are
subject to the same hardware limitations as the final code.  It will
also give your prototype a speed boost over the CPU version :)

http://deeplearning.net/software/theano/
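
If you stay on the CPU for part of the validation, one rough way to
see what "single precision throughout" does to an algorithm is to
round to single precision after every operation.  Below is a minimal
sketch using only the standard library's struct module.  The helper
names to_f32 and f32_sum are hypothetical, and the sketch assumes that
rounding the double-precision result of each individual addition to
single precision is an acceptable stand-in for a native
single-precision addition.  It is an illustration, not a NumPy
feature.

```python
import struct


def to_f32(x):
    """Round a Python float (an IEEE double) to the nearest single-precision value."""
    # Packing as a 4-byte float and unpacking again discards the extra bits.
    return struct.unpack("f", struct.pack("f", x))[0]


def f32_sum(values):
    """Sum values, rounding to single precision after every addition."""
    total = to_f32(0.0)
    for v in values:
        # Round the input and the running total so no intermediate
        # result keeps more than single precision.
        total = to_f32(total + to_f32(v))
    return total


values = [0.1] * 10
double_sum = sum(values)       # accumulated in double precision
single_sum = f32_sum(values)   # accumulated in emulated single precision

print("double:", repr(double_sum))
print("single:", repr(single_sum))
```

Comparing the two sums gives a feel for how far a single-precision run
can drift from the double-precision one.  In NumPy itself, the closest
equivalent is to keep every array and scalar as np.float32.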

David