[Python-Dev] RE: [Spambayes] Question (or possibly a bug report)

Meyer, Tony T.A.Meyer@massey.ac.nz
Fri, 25 Jul 2003 10:26:25 +1200

[Tim Peters on Friday, 25 July 2003]
> > (Glad you posted this - I was wading through the progress of
> > marshalling (PyOS_snprintf etc) and getting rapidly lost).
> It's the unmarshalling code that's relevant -- that just
> passes a string to atof().

So it is :)  Amazing how much clearer things are in the morning.  Or
after reading your explanation.  One of the two ;)

> > Gives me with gcc version 3.2 20020927 (prerelease):
> > 	0.100000
> It's possible that glibc doesn't recognize "german" as a
> legitimate locale name (so that the setlocale() call had no effect).

Ah yes, I should have thought of that.  I can't get glibc to accept
anything apart from "C", although mingw does, and it gives the same
result as MS C.
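
For anyone wanting to check the same thing from Python rather than C,
here is a minimal sketch.  The helper name try_numeric_locale is made up
for illustration; it only relies on locale.setlocale() raising
locale.Error when the platform rejects a name, and it restores the
previous setting before returning.

```python
import locale


def try_numeric_locale(name):
    """Return the LC_NUMERIC string the platform actually set, or None.

    Hypothetical helper: it only probes whether `name` is accepted and
    always puts the previous LC_NUMERIC setting back afterwards.
    """
    saved = locale.setlocale(locale.LC_NUMERIC)  # query, does not change anything
    try:
        return locale.setlocale(locale.LC_NUMERIC, name)
    except locale.Error:
        # The C library did not recognize the name, so nothing changed.
        return None
    finally:
        locale.setlocale(locale.LC_NUMERIC, saved)


if __name__ == "__main__":
    for candidate in ("C", "german", "de_DE", "de_DE.UTF-8"):
        print(candidate, "->", try_numeric_locale(candidate))
```

Which of the German names are accepted depends entirely on the platform
and on which locales are installed, so no particular output is implied.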

> atof does have to stop at the first unrecognized character,
> but atof is locale-dependent, so which characters are and aren't
> recognized depends on the locale.

As you said, the difference is whether the thousands separator is
ignored or not.  (Calling C's atof on "1,000" gives me 1.0, both in the
regular old "C" locale *and* in "en".)  So Python's locale.atof is better
than C's, because it properly takes the thousands separator into
account: locale.atof("1,000") gives me an exception ("C"), 1000.0 ("en")
and 1.0 ("german").
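
The "stop at the first unrecognized character" behaviour can be imitated
in pure Python, which makes the results above easier to see without
depending on which locales are installed.  This is only a rough sketch:
c_style_atof and its decimal_point parameter are invented here, it
ignores exponents, hex floats and inf/nan, and it is not how the C
library or Python's marshal code is implemented.

```python
import re


def c_style_atof(text, decimal_point="."):
    """Parse the longest leading number in `text`, roughly like C's atof().

    `decimal_point` stands in for the current LC_NUMERIC setting.
    Anything after the recognized prefix is silently ignored, and 0.0 is
    returned when no number can be read at all.
    """
    dp = re.escape(decimal_point)
    # Optional sign, then either digits with an optional fraction,
    # or a bare fraction such as ".5".
    number = re.compile(r"\s*([+-]?(?:\d+(?:" + dp + r"\d*)?|" + dp + r"\d+))")
    match = number.match(text)
    if match is None:
        return 0.0
    return float(match.group(1).replace(decimal_point, "."))


if __name__ == "__main__":
    print(c_style_atof("1,000"))               # "C"-like: stops at the comma
    print(c_style_atof("1,000", ","))          # German-like: comma is the decimal point
    print(c_style_atof("0.001", ","))          # German-like: stops at the period
    print(c_style_atof(".001", ","))           # German-like: nothing recognized
```

With a comma as the decimal point, a string written as "0.001" or ".001"
is cut off at the period, which is the kind of silent damage being
discussed for floats loaded under the wrong LC_NUMERIC.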

> it doesn't matter to spambayes
> either way (whether we load .001 as 0.0 or as 1.0 is a disaster either
> way)

True, although I am finding this interesting and learning something,
which is good for me :)

> The way we're using Python with Outlook doesn't meet the documented
> requirements for using Python, so for now everything that
> goes wrong here is our problem.

Well, Mark's problem ;)

> It would be better if Python didn't use locale-dependent
> string<->float conversions internally, but that's just not
> the case (yet).

Is this likely to become the case?  (In 2.4, for example?)

> Python requires that the (true -- from the C library's POV) LC_NUMERIC
> category be "C" locale.  That isn't English (although it looks a lot
> like it to Germans <wink>), and we don't care about any category other
> than LC_NUMERIC here.

My mistake.  I should have said "C" and not "English".  (It is proving a
little difficult, for me at least, to find a place where the C runtime's
locale can be put back to "C" so that the plugin works.)
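
For what it's worth, the general shape of "put LC_NUMERIC back to C
around the sensitive code" looks something like the sketch below.  The
name numeric_locale_c is made up, and where such a wrapper would need to
go in the plugin is exactly the open question, so this is an
illustration of the idea rather than a fix.  Note that setlocale()
affects the whole process, so this is not safe if other threads depend
on the locale at the same time.

```python
import contextlib
import locale


@contextlib.contextmanager
def numeric_locale_c():
    """Temporarily force LC_NUMERIC to "C", then restore the old setting.

    Hypothetical helper for illustration only; setlocale() is
    process-wide, so other threads see the change too.
    """
    saved = locale.setlocale(locale.LC_NUMERIC)  # query the current setting
    locale.setlocale(locale.LC_NUMERIC, "C")
    try:
        yield
    finally:
        locale.setlocale(locale.LC_NUMERIC, saved)


if __name__ == "__main__":
    with numeric_locale_c():
        # Inside the block the decimal point is "." again.
        print(locale.localeconv()["decimal_point"])
```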

=Tony Meyer