[Python-Dev] RE: [Spambayes] Question (or possibly a bug report)

Tim Peters tim.one@comcast.net
Thu, 24 Jul 2003 01:42:56 -0400


[Meyer, Tony]
> ...
> How 0.1 becomes 0[.0] in German, when the damage script says it should
> equal 1x10^16, I still don't know.

The "damage" script is apparently incorrect.  I added a sys.muckit()
function to my Python, like so (in sysmodule.c):

static PyObject *
sys_muckit(PyObject *self, PyObject *args)
{
#include <locale.h>
	setlocale(LC_NUMERIC, "german");
	Py_INCREF(Py_None);
	return Py_None;
}

Then

>>> import marshal
>>> s = marshal.dumps(0.001)
>>> s
'f\x050.001'
>>> marshal.loads(s)
0.001
>>> import sys
>>> sys.muckit()
>>> marshal.loads(s)
0.0
>>>
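For readers unfamiliar with the string shown above: it is the old text-based float format, a type code 'f', a one-byte length, and then the float spelled out in ASCII. That matters because the text has to be converted back to a double on load, so the conversion is exposed to the C locale. A minimal sketch that only takes the bytes apart (the helper name split_marshaled_float is made up for illustration, and it uses Python 3 bytes rather than the str shown in the session):

```python
def split_marshaled_float(data):
    """Split a text-format marshaled float into (type code, length, text).

    Hypothetical helper for illustration only; it does not call marshal
    and handles nothing but the one-byte-length 'f' layout shown above.
    """
    type_code = data[0:1]           # b'f' marks a text-format float
    length = data[1]                # one byte giving the payload length
    payload = data[2:2 + length]    # the float written out as ASCII text
    return type_code, length, payload.decode("ascii")


print(split_marshaled_float(b"f\x050.001"))
```

The payload is the plain string '0.001', period and all, regardless of which locale is active when it is read back.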

So when the marshaled representation of 0.001 is loaded under "german"
LC_NUMERIC here, we get back exactly 0.0.  I'm not sure why.  More, now that
I've screwed up the locale:

>>> 0.001
0.0
>>> 0.1
0.0
>>> 1.0
1.0
>>> 0.99999999
0.0
>>> 123.456
123.0
>>> 1e20
1e+020
>>> 2.34e20
2.0
>>>

So the obvious <wink> answers are:

1. When LC_NUMERIC is "german", MS C's atof() stops at the first
   period it sees.

2. Python's emulation of locale-aware atof (function atof in file
   Lib/locale.py) doesn't correctly emulate the platform C atof()
   in this case.  I don't know why that is (and am waaaaay out of
   time for today), but the "damage" script used locale.atof(), so
   drew wrong conclusions about MS locale reality.