RE: [Python-Dev] RE: [Spambayes] Question (or possibly a bug report)
[Tim Peters on Friday, 25 July 2003]
(Glad you posted this - I was wading through the progress of marshalling (PyOS_snprintf etc) and getting rapidly lost). It's the unmarshalling code that's relevant -- that just passes a string to atof().
So it is :) Amazing how much clearer things are in the morning. Or after reading your explanation. One of the two ;)
Gives me with gcc version 3.2 20020927 (prerelease): 0.100000
It's possible that glibc doesn't recognize "german" as a legitimate locale name (so that the setlocale() call had no effect).
Ah yes, should have thought of that. I can't get it to accept anything apart from "C" although mingw does and gives the same result as MS C.
atof does have to stop at the first unrecognized character, but atof is locale-dependent, so which characters are and aren't recognized depends on the locale.
As you said, the difference is whether the thousands separator is ignored or not. (Trying to atof "1,000" gives me 1.0, just in the regular old C locale, *and* in "en"). So Python's locale.atof is better than C's, because it properly takes the thousands separator into account. Python does behave correctly, too - locale.atof("1,000") gives me an exception ("C"), 1000.0 ("en") and 1.0 ("german").
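For anyone following along, the two behaviours compared above can be mimicked in pure Python without touching the process locale. This is only a rough sketch: `c_style_atof` and `locale_style_atof` are made-up names, the separators are passed in explicitly instead of being read from `localeconv()`, and the prefix matching is a simplification of what a C library's atof() really does (no hex, inf or nan handling).

```python
import re


def c_style_atof(text, decimal_point="."):
    """Parse the longest numeric prefix, like C's atof(); return 0.0 if none."""
    dp = re.escape(decimal_point)
    pattern = r"\s*[+-]?(\d+(" + dp + r"\d*)?|" + dp + r"\d+)([eE][+-]?\d+)?"
    match = re.match(pattern, text)
    if match is None:
        return 0.0
    # Anything after the matched prefix is silently ignored.
    return float(match.group(0).replace(decimal_point, "."))


def locale_style_atof(text, decimal_point=".", thousands_sep=""):
    """Roughly the locale.atof approach: strip grouping, then call float()."""
    if thousands_sep:
        text = text.replace(thousands_sep, "")
    if decimal_point != ".":
        text = text.replace(decimal_point, ".")
    return float(text)  # raises ValueError if unrecognized characters remain


# "C"-like settings: the comma ends the number for the C-style parser.
print(c_style_atof("1,000"))
# "german"-like settings: "." is not the decimal point, so parsing stops at it.
print(c_style_atof("0.001", decimal_point=","))
# "en"-like settings: the comma is a grouping character and is dropped.
print(locale_style_atof("1,000", thousands_sep=","))
```

The second call is the interesting one for the .pyc case: a float written as "0.001" is read back as 0.0 once the decimal point is assumed to be a comma.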
it doesn't matter to spambayes either way (whether we load .001 as 0.0 or as 1.0 is a disaster either way).
True, although I am finding this interesting and learning something, which is good for me :)
The way we're using Python with Outlook doesn't meet the documented requirements for using Python, so for now everything that goes wrong here is our problem.
Well, Mark's problem ;)
It would be better if Python didn't use locale-dependent string<->float conversions internally, but that's just not the case (yet).
Is this likely to become the case? (In 2.4, for example?)
Python requires that the (true -- from the C library's POV) LC_NUMERIC category be "C" locale. That isn't English (although it looks a lot like it to Germans <wink>), and we don't care about any category other than LC_NUMERIC here.
My mistake. I should have said "C" and not "English". (It is proving a little difficult, for me at least, to find a place where the locale can be put back to "C" so that the plugin works.) =Tony Meyer
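One possible shape for "putting the locale back", sketched with the standard `locale` module. The helper name `numeric_locale_c` is hypothetical, this says nothing about where in the plugin such a call would belong, and setlocale() affects the whole process, so it is not safe if another thread changes the locale at the same time.

```python
import locale
from contextlib import contextmanager


@contextmanager
def numeric_locale_c():
    """Temporarily force LC_NUMERIC to "C", restoring the old value on exit."""
    saved = locale.setlocale(locale.LC_NUMERIC)  # no second argument: query only
    locale.setlocale(locale.LC_NUMERIC, "C")
    try:
        yield
    finally:
        locale.setlocale(locale.LC_NUMERIC, saved)


with numeric_locale_c():
    # Inside the block the C library's decimal point is ".".
    assert locale.localeconv()["decimal_point"] == "."
    value = float("0.001")
```

Restoring the saved value afterwards keeps the host application's own number formatting unchanged once the block exits.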
[Tim]
It would be better if Python didn't use locale-dependent string<->float conversions internally, but that's just not the case (yet).
[Tony Meyer]
Is this likely to become the case? (In 2.4, for example?)
I think so, and maybe before that. Not in 2.3 final, but maybe in 2.3.1 -- numeric locale problems can be catastrophic to Python programmers, so I'm comfortable arguing the case for calling it a bugfix. Whether it happens depends on who's willing and able to do the work, of course. There's a patch pending on a Python tracker to do it, but that uses a pile of code borrowed from glibc, and that's got problems of its own.
"Tim Peters"
[Tim]
It would be better if Python didn't use locale-dependent string<->float conversions internally, but that's just not the case (yet).
[Tony Meyer]
Is this likely to become the case? (In 2.4, for example?)
I think so, and maybe before that. Not in 2.3 final, but maybe in 2.3.1 -- numeric locale problems can be catastrophic to Python programmers, so I'm comfortable arguing the case for calling it a bugfix.
Lib Manual 3.19 says clearly that the marshal format for a specific release is system independent and transportable. It includes floats as one of the types so supported. If, as I have gathered from this thread, the format for floats is actually, under certain circumstances, whimsy-dependent, I think a warning should be added to the doc until the bug is fixed. Terry J. Reedy
[Terry Reedy]
Lib Manual 3.19 says clearly that the marshal format for a specific release is system independent and transportable. It includes floats as one of the types so supported. If, as I have gathered from this thread, the format for floats is actually, under certain circumstances, whimsy-dependent, I think a warning should be added to the doc until the bug is fixed.
The warning is there, but it's in the locale module docs (see the "For extension writers and programs that embed Python" section). It's a documented requirement that LC_NUMERIC be "C" when using Python, and violating that is akin to dividing by 0 in C: nothing is defined if you break the rules. The .pyc problem is only one of the things that can go wrong, and is getting all the attention here just because it *is* going wrong in the spambayes Outlook addin. Note that you can't fall into this trap running a pure Python program. It requires that you also run some non-Python code in the same process that mucks with C runtime state in a forbidden-by-Python way. Python can't *stop* non-Python C code from screwing up the locale, it can only document (and does document) that Python may not work as intended if that occurs. Since Python is functioning as designed and documented in such cases, it's hard to sell this as "a bug" in a convincing way. As Python spreads into more embedded contexts, though, the consequences of this design decision (really more of a no-brainer that looks like "a decision" in hindsight: C's horrid locale gimmicks didn't exist when these parts of Python were first written!) justify arguing for a friendlier (to embedding) design.
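To make the .pyc problem concrete: as noted earlier in the thread, unmarshalling passes a string to atof(), so a marshalled float is only as reliable as the text-to-float conversion that reads it back. As an aside that does not come from this thread, later Python releases added marshal format version 2, which stores floats in binary. A small sketch, assuming a Python 3 interpreter where `marshal.dumps` accepts a version argument:

```python
import marshal
import struct

x = 0.001

text_form = marshal.dumps(x, 1)    # version 1: the float is written as text
binary_form = marshal.dumps(x, 2)  # version 2: the float is written as 8 raw bytes

# After the one-byte type code, the version 2 payload is a little-endian
# IEEE-754 double, so no decimal point is involved when it is read back.
assert binary_form[1:] == struct.pack("<d", x)
assert marshal.loads(binary_form) == x
```

A binary payload has no decimal point to misread, which is one way to take the locale out of the picture for this particular case.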
On Thu, Jul 24, 2003 at 11:25:25PM -0400, Tim Peters wrote:
[Tony Meyer]
Is this likely to become the case? (In 2.4, for example?)
I think so, and maybe before that. Not in 2.3 final, but maybe in 2.3.1 -- numeric locale problems can be catastrophic to Python programmers, so I'm comfortable arguing the case for calling it a bugfix. Whether it happens depends on who's willing and able to do the work, of course. There's a patch pending on a Python tracker to do it, but that uses a pile of code borrowed from glibc, and that's got problems of its own.
FWIW, it's actually borrowed from glib (GTK+'s abstraction library), and we've got tacit permission to include it (and as soon as Alex Larsson is back from vacation, he'll most likely sign the contributor agreement). For details, see my PEP submission at http://www.async.com.br/~kiko/pep-numeric.txt Take care, -- Christian Reis, Senior Engineer, Async Open Source, Brazil. http://async.com.br/~kiko/ | [+55 16] 261 2331 | NMFL
participants (4)
- Christian Reis
- Meyer, Tony
- Terry Reedy
- Tim Peters