Strange locale problem with Python 3
Hi While helping Brandon Rhodes to port PyEphem[1] to Python 3 we struggled over a strange locale-related problem on OS X. PyEphem is a library which can do astronomical computations like tracking the position of stars, planets and earth satellites relative to the earth's position. When testing out the Python 3 release of PyEphem I noticed that on my OS X laptop a lot of calculations were wrong (not completely wrong, but clearly not accurate) compared to Python 2.5. We (well mostly Brandon) were able to track down the problem to the TLE parser (TLE are data file containing the orbital elements of an object) which appears to read most values wrong with python 3. In fact it cut of the decimal parts of all floats (1.123232 got 1, etc). Manually setting LANG and LC_ALL to C solved the problem. It turns out that some parts of Python 3 got more locale dependent on some platforms. The only platform I am aware of is OS X, on Linux Python 3 appears to behave like Python 2.x did. In case of PyEphem the problem was in the C extension which got more locale dependent, for example atof() or scanf() with Python 3 now expected the german decimal-delimiter ',' instead of the '.' in floats (3.14 vs. 3,14). On the other hand the constructor of float still expects '.' all the time. But the built-in function strptime() honors locales with Python 3 and expects german week day. I've written a simple script and a simple C extension which illustrates the problem. Both the extension and the script run python 2.x and python 3, so you can easily compare the result while executing the script in different environments. I was only able to reproduce the problem on OS X (10.5) and using a german locale like "de_CH.UTF-8". When manually setting LC_ALL=C, the differences disappears. I can't imagine that his behavior was really intended, and I hope the test case helps you guys to identify/fix this problem. Download the test case from: http://github.com/retoo/py3k-locale-problem/tarball/master or get it using git: git://github.com/retoo/py3k-locale-problem.git You can use the following steps to build it: $ python2.5 setup.py build $ python3.0 setup.py build To run the tests with python 2.5, enter: $ (cd build/lib*-2.5; python2.5 py3k_locale_problem.py) ... for 3.0 ... $ (cd build/lib*-3.0; python3.0 py3k_locale_problem.py) In the file 'results.txt' you can see the output from my OS X system. Cheers, Reto Schüttel [1] http://rhodesmill.org/pyephem/
On Sun, Feb 01, 2009, Reto Sch?ttel wrote:
While helping Brandon Rhodes to port PyEphem[1] to Python 3 we struggled over a strange locale-related problem on OS X.
Please file a report at bugs.python.org -- that's the only way to ensure that this gets tracked. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ Weinberg's Second Law: If builders built buildings the way programmers wrote programs, then the first woodpecker that came along would destroy civilization.
Sorry! Somehow my mailclient managed to mess up my message, here another copy of it, hopefully in better conditions than last time. Aahz: Good idea, I've created a bug report. Bug ID 5125 Hi While helping Brandon Rhodes to port PyEphem[1] to Python 3 we struggled over a strange locale-related problem on OS X. PyEphem is a library which can do astronomical computations like tracking the position of stars, planets and earth satellites relative to the earth's position. When testing out the Python 3 release of PyEphem I noticed that on my OS X laptop a lot of calculations were wrong (not completely wrong, but clearly not accurate) compared to Python 2.5. We (well mostly Brandon) were able to track down the problem to the TLE parser (TLE are data file containing the orbital elements of an object) which appears to read most values wrong with python 3. In fact it cut of the decimal parts of all floats (1.123232 got 1, etc). Manually setting LANG and LC_ALL to C solved the problem. It turns out that some parts of Python 3 got more locale dependent on some platforms. The only platform I am aware of is OS X, on Linux Python 3 appears to behave like Python 2.x did. In case of PyEphem the problem was in the C extension which got more locale dependent, for example atof() or scanf() with Python 3 now expected the german decimal-delimiter ',' instead of the '.' in floats (3.14 vs. 3,14). On the other hand the constructor of float still expects '.' all the time. But the built-in function strptime() honors locales with Python 3 and expects german week day. I've written a simple script and a simple C extension which illustrates the problem. Both the extension and the script run python 2.x and python 3, so you can easily compare the result while executing the script in different environments. I was only able to reproduce the problem on OS X (10.5) and using a german locale like "de_CH.UTF-8". When manually setting LC_ALL=C, the differences disappears. I can't imagine that his behavior was really intended, and I hope the test case helps you guys to identify/fix this problem. Download the test case from: http://github.com/retoo/py3k-locale-problem/tarball/master or get it using git: git://github.com/retoo/py3k-locale-problem.git You can use the following steps to build it: $ python2.5 setup.py build $ python3.0 setup.py build To run the tests with python 2.5, enter: $ (cd build/lib*-2.5; python2.5 py3k_locale_problem.py) ... for 3.0 ... $ (cd build/lib*-3.0; python3.0 py3k_locale_problem.py) In the file 'results.txt' you can see the output from my OS X system. Cheers, Reto Schuettel [1] http://rhodesmill.org/pyephem/
On 2009-02-01 19:44, Reto Schüttel wrote:
Hi
While helping Brandon Rhodes to port PyEphem[1] to Python 3 we struggled over a strange locale-related problem on OS X. PyEphem is a library which can do astronomical computations like tracking the position of stars, planets and earth satellites relative to the earth's position. When testing out the Python 3 release of PyEphem I noticed that on my OS X laptop a lot of calculations were wrong (not completely wrong, but clearly not accurate) compared to Python 2.5. We (well mostly Brandon) were able to track down the problem to the TLE parser (TLE are data file containing the orbital elements of an object) which appears to read most values wrong with python 3. In fact it cut of the decimal parts of all floats (1.123232 got 1, etc). Manually setting LANG and LC_ALL to C solved the problem.
It turns out that some parts of Python 3 got more locale dependent on some platforms. The only platform I am aware of is OS X, on Linux Python 3 appears to behave like Python 2.x did.
This is probably due to the unconditional call to setlocale() in pythonrun.c: /* Set up the LC_CTYPE locale, so we can obtain the locale's charset without having to switch locales. */ setlocale(LC_CTYPE, ""); In Python 2, no such call is made and as a result the C lib defaults to the "C" locale. Calling setlocale() in an application is always dangerous due to the many side-effects this can have on the C lib parsing and formatting APIs. If this is done just to figure the environment's locale settings, then it's better to reset the locale to the one that was active before the setlocale() call. Python 2 uses this approach. Python 3 does not.
In case of PyEphem the problem was in the C extension which got more locale dependent, for example atof() or scanf() with Python 3 now expected the german decimal-delimiter ',' instead of the '.' in floats (3.14 vs. 3,14). On the other hand the constructor of float still expects '.' all the time. But the built-in function strptime() honors locales with Python 3 and expects german week day.
I've written a simple script and a simple C extension which illustrates the problem. Both the extension and the script run python 2.x and python 3, so you can easily compare the result while executing the script in different environments.
I was only able to reproduce the problem on OS X (10.5) and using a german locale like "de_CH.UTF-8". When manually setting LC_ALL=C, the differences disappears.
I can't imagine that his behavior was really intended, and I hope the test case helps you guys to identify/fix this problem.
Download the test case from: http://github.com/retoo/py3k-locale-problem/tarball/master or get it using git: git://github.com/retoo/py3k-locale-problem.git
You can use the following steps to build it:
$ python2.5 setup.py build $ python3.0 setup.py build
To run the tests with python 2.5, enter: $ (cd build/lib*-2.5; python2.5 py3k_locale_problem.py) ... for 3.0 ... $ (cd build/lib*-3.0; python3.0 py3k_locale_problem.py)
In the file 'results.txt' you can see the output from my OS X system.
Cheers, Reto Schüttel
-- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Feb 02 2009)
Python/Zope Consulting and Support ... http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/
::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/
participants (4)
-
Aahz
-
M.-A. Lemburg
-
Reto Schuettel
-
Reto Schüttel