[Numpy-discussion] float128 in fact float80

Matthew Brett matthew.brett at gmail.com
Sun Oct 16 03:04:37 EDT 2011


Hi,

On Sat, Oct 15, 2011 at 11:04 PM, Nadav Horesh <nadavh at visionsense.com> wrote:
> On 32 bit systems it consumes 96 bits (3 x 32), and hence float96.
> On 64 bit machines it consumes 128 bits (2 x 64).
> The variable size is chosen for efficient addressing, while the calculation in hardware is carried out in the 80-bit FPU (x87) registers.

Right - but the problem here is that it is very confusing.  There is
something called binary128 in the IEEE standard, and what numpy has is
not that.  float16, float32 and float64 all correspond to the IEEE
formats called binary16, binary32 and binary64.
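
You can see which format you've actually got by asking numpy -
something like this (on my 64-bit x86 Linux box; on 32-bit I'd expect
float96 and an itemsize of 12):

import numpy as np

# Probe the extended type: x87 80-bit extended reports a 63-bit stored
# significand, whereas a true IEEE binary128 type would report 112.
info = np.finfo(np.longdouble)
print(np.longdouble)                     # numpy.float128 on this machine
print(info.nmant)                        # 63, not 112
print(np.dtype(np.longdouble).itemsize)  # 16 bytes of storage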

Thus it was natural for me to wrongly assume that float128 was the
IEEE binary128 format.  I'd therefore assume that it could store all
the integers up to 2**113 exactly, and so on.
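
A quick check shows the difference - assuming an x86 machine where
longdouble is the 80-bit extended type, the 64-bit significand cannot
hold 2**64 + 1, whereas real binary128 (113-bit significand) could:

import numpy as np

# 2**64 + 1 needs 65 significant bits, so it cannot survive storage in
# an 80-bit extended value with a 64-bit significand.
x = np.longdouble(2) ** 64
print(x + 1 == x)   # True here: the 1 is silently lost
print(int(x + 1))   # 18446744073709551616, i.e. still 2**64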

On the other hand, if I found out that the float80 dtype in fact took
128 bits of storage, I'd rightly conclude that the data were being
padded out with zeros and not be very surprised.
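
If you want to see that padding directly, something like this works
(again on 64-bit x86 Linux, where the value lives in the low 10 bytes
of each 16-byte element; whether the padding bytes are actually zero
depends on how the buffer was allocated):

import numpy as np

# Inspect the raw element bytes; the buffer from np.zeros starts
# zeroed, so the padding bytes show up as zeros after the 10-byte value.
a = np.zeros(1, dtype=np.longdouble)
a[0] = 1.0
print(a.itemsize)        # 16 here, 12 on 32-bit Linux
print(a.view(np.uint8))  # value in the low 10 bytes, zeros in the padding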

I also think I'd find it easier to understand what was going on if
there were a float80 type on both 32-bit and 64-bit, just with
different itemsizes.

If there were a single float80 type (with different itemsizes on
32-bit and 64-bit) then I would not have to write guarding try ..
excepts around my use of these types to stay compatible across
platforms, along the lines of the sketch below.
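
At the moment that guard looks something like this, falling back to
plain float64 on platforms with no extended type at all:

import numpy as np

# Pick whichever extended type this build exposes; the name differs
# between 32-bit (float96) and 64-bit (float128) builds, and some
# platforms have neither.
try:
    extended = np.float128
except AttributeError:
    try:
        extended = np.float96
    except AttributeError:
        extended = np.float64   # no extended type; fall back to double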

So float80 on both platforms seems like the less confusing option to me.

Best,

Matthew


