[Python-Dev] Caching float(0.0)

Terry Reedy tjreedy at udel.edu
Tue Oct 3 02:05:28 CEST 2006


"Kristján V. Jónsson" <kristjan at ccpgames.com> wrote in message 
news:129CEF95A523704B9D46959C922A280002FE99A9 at nemesis.central.ccp.cc...
>Anyway, Skip noted that 50% of all floats are whole numbers between -10 
>and 10 inclusive,

Please, no.  He said something like this about *non-floating-point 
applications* (evidence unspecified, that I remember).  But such 
applications, by definition, usually don't have enough floats for caching 
(or conversion time) to matter too much.

For true floating point measurements (of temperature, for instance), 
'integral' measurements (which are an artifact of the scale used (degrees F 
versus C versus K)) should generally be no more common than other realized 
measurements.

Thirty years ago, a major stat package written in Fortran (BMDP) required 
that all data be stored as (Fortran 4-byte) floats for analysis.  So a 
column of yes/no or male/female data would be stored as 0.0/1.0 or perhaps 
1.0/2.0.  That skewed the distribution of floats.  But Python and, I hope, 
Python apps, are more modern than that.

>and this is the code that I employ in our python build today:

[snip]

For the analysis of typical floating point data, this is all pointless and 
a complete waste of time.  After a billion comversions or so, I expect the 
extra time might add up to something noticeable.

> From: "Martin v. Löwis" [mailto:martin at v.loewis.de]
>> I'm worried about the penalty that this causes in terms of
>> run-time cost.

Me too.

>> Also, how do you chose what values to cache?

At one time (don't know about today), it was mandatory in some Fortran 
circles to name the small float constants used in a particular program with 
the equivalent of C #defines.  In Python,
zero = 0.0, half = 0.5, one = 1.0, twopi = 6.29..., eee = 2.7..., phi = 
.617..., etc. (Note that naming is not restricted to integral or otherwise 
'nice' values.)

The purpose then was to allow easy conversion from float to double to 
extended double.  And in some cases, it also made the code clearer.  With 
Python, the same procedure would guarantee only one copy (caching) of the 
same floats for constructed data structures.

Float caching strikes me a a good subject for cookbook recipies, but not, 
without real data and a willingness to slightly screw some users, for the 
default core code.

Terry Jan Reedy





More information about the Python-Dev mailing list