Re: [Numpy-discussion] float128 in fact float80

Oct. 16, 2011

      On Sun, Oct 16, 2011 at 8:33 AM, Matthew Brett <matthew.brett@gmail.com> wrote:
...
Hi,
On Sun, Oct 16, 2011 at 12:28 AM, David Cournapeau <cournape@gmail.com> wrote:
...
On Sun, Oct 16, 2011 at 8:04 AM, Matthew Brett <matthew.brett@gmail.com> wrote:
...
Hi,
On Sat, Oct 15, 2011 at 11:04 PM, Nadav Horesh <nadavh@visionsense.com> wrote:
...
On 32-bit systems it consumes 96 bits (3 x 32), hence float96.
On 64-bit machines it consumes 128 bits (2 x 64), hence float128.
The storage size is chosen for efficient addressing, while the calculation in hardware is carried out in the 80-bit x87 FPU registers.
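To check the difference between storage size and actual precision on a given machine, something like the following minimal sketch may help. It assumes NumPy is installed. `describe_format` and `longdouble_summary` are hypothetical helper names, and the table mapping mantissa bits to format names is a summary written for this illustration, not something NumPy provides.

```python
# Explicitly stored mantissa (fraction) bits for a few well-known formats.
# This table is an illustrative assumption, not part of NumPy.
FORMATS = {
    52: "IEEE binary64 (long double is the same as double)",
    63: "x87 80-bit extended precision",
    112: "IEEE binary128 (quadruple precision)",
}


def describe_format(mantissa_bits):
    """Map a mantissa width to a format name, with a generic fallback."""
    return FORMATS.get(mantissa_bits, "other (e.g. a double-double pair)")


def longdouble_summary():
    """Return (storage bits, mantissa bits, format name) for np.longdouble."""
    import numpy as np  # imported here so describe_format works without NumPy

    storage_bits = np.dtype(np.longdouble).itemsize * 8
    mantissa_bits = np.finfo(np.longdouble).nmant
    return storage_bits, mantissa_bits, describe_format(mantissa_bits)
```

On 64-bit x86 Linux, `longdouble_summary()` would typically report 128 storage bits but only 63 mantissa bits, i.e. the x87 format padded to 16 bytes.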
Right - but the problem here is that it is very confusing.  There is
something called binary128 in the IEEE standard, and what numpy has is
not that.  float16, float32 and float64 are all IEEE standards called
binary16, binary32 and binary64.
This one is easy: few CPUs support the 128-bit float specified in the IEEE
standard (the only ones I know of are the expensive IBM ones). Then there
are the cases where it is implemented in software (SPARC uses the
double-pair IIRC).
So you would need binary80, binary96, binary128, binary128_double_pair,
etc... That would be a nightmare to support, and also not portable:
what does binary80 become on ppc ? What does binary96 become on 32-bit
Intel ? Or on Windows (where long double is the same as double
for Visual Studio) ?
binary128 should only be thought of as a (bad) synonym for np.longdouble.
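As a rough illustration of why the size-based names are not portable, the sketch below checks which of them a given NumPy build defines. It assumes NumPy is installed. `extended_aliases` and `report` are hypothetical helpers introduced only for this example.

```python
def extended_aliases(namespace):
    """Return the size-based extended-precision names that `namespace` defines."""
    return [name for name in ("float96", "float128") if hasattr(namespace, name)]


def report():
    """Summarize the extended-precision names available in this NumPy build."""
    import numpy as np  # imported here so extended_aliases works without NumPy

    return {
        "aliases": extended_aliases(np),
        # np.longdouble exists on every platform, whatever its actual format.
        "longdouble_storage_bits": np.dtype(np.longdouble).itemsize * 8,
    }
```

Code that refers to np.longdouble runs everywhere, while code that hard-codes one of the size-based names fails on builds where `extended_aliases(np)` does not list it.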
What would be the nightmare to support - the different names for the
different precisions?
Well, if you have an array of np.float80, what does it do on ppc, or
Windows, or Solaris ? You will have a myriad of incompatible formats,
and the only thing you gain by naming them differently is that you
lose the ability to use the code on different platforms. The
alternative is to implement a quadruple-precision number in software.

Using extended precision is fundamentally non-portable on today's CPUs.

David