On Sun, Dec 28, 2008 at 9:38 PM, David Cournapeau <cournape@gmail.com> wrote:
On Sun, Dec 28, 2008 at 4:12 PM, Charles R Harris <charlesr.harris@gmail.com> wrote:
On Sat, Dec 27, 2008 at 11:40 PM, David Cournapeau <david@ar.media.kyoto-u.ac.jp> wrote:
Robert Kern wrote:
We should not support locales. The string representations of these elements should be Python-parseable.
It looks like I was wrong in my analysis of the problem: I thought I was using the most recent implementation of the PyOS_* functions in my test code, but the ones in 2.6 are not the same as the ones in the current trunk. So the problem may be easier to fix than what I first thought: simply providing our own PyOS_ascii_formatd (and similar for float and long double) may be enough, and since we don't care about locale (%Z and %n), the function is simple (and can be pulled out from the Python sources).
We would then use PyOS_ascii_format* (locale independent) instead of PyOS_snprintf (locale dependent) in the str/repr implementation of scalar arrays. Does that sound acceptable to you?
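To illustrate why a locale-dependent formatter conflicts with the "Python-parseable" requirement, here is a minimal Python sketch. It does not call the PyOS_* functions discussed above; `format_with_decimal_point` and `is_python_parseable` are hypothetical helpers that only mimic what a comma-decimal locale would do to the output.

```python
def format_with_decimal_point(value, decimal_point="."):
    # Hypothetical stand-in for a locale-aware formatter: format with the
    # C-style '.' and then swap in the locale's decimal point character.
    return ("%.17g" % value).replace(".", decimal_point)


def is_python_parseable(text):
    # True if Python's float() accepts the text.
    try:
        float(text)
    except ValueError:
        return False
    return True


c_style = format_with_decimal_point(0.5)                         # '.' as decimal point
comma_style = format_with_decimal_point(0.5, decimal_point=",")  # ',' as decimal point

print(c_style, is_python_parseable(c_style))
print(comma_style, is_python_parseable(comma_style))
```

Only the '.'-based string survives a round trip through float(), which is the property the str/repr of scalars should keep.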
I put yesterday's work in the fix_float_format branch:
- it fixes the locale issue
- it fixes the long double issue on Windows
- it also fixes some tests (we were not testing single precision formatting, but double precision twice instead; the single precision test fails on the trunk, BTW).
Curious, I don't see any test failures here. Were the tests actually being run, or is something else different in your test setup? Or do you mean the fixed-up test fails?
- it handles inf and nan more consistently across platforms (e.g. str(np.log(0)) will be '-inf' on all platforms; on Windows, it used to be '-1.#INF'. I was afraid it would break converting the string back to float, but that was already broken before my change, e.g. float('-1.#INF') does not work on Windows).
- for now, it breaks on Windows with Python 2.5, because float(1e10) used to be 1e+010 on Python 2.5 and is 1e+10 on Python 2.6 (to be more consistent with C99). But I could simply force backward compatibility with Python 2.5/2.4, since I can control the number of digits in the exponent in the formatting code.
There are still some problems related to double which I am not sure how to solve:
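The two normalizations described above (a single spelling for infinities, and a two-digit minimum exponent as in C99) can be sketched in Python as a post-processing step on an already formatted string. This is only an illustration, not the code in the fix_float_format branch: `normalize_float_text` is a hypothetical name, and the '#QNAN' / '#IND' spellings for NaN are an assumption about Windows output that the message above does not mention.

```python
import re

# Exponent with surplus leading zeros, e.g. 'e+010'; keep at least two digits.
_EXPONENT = re.compile(r"([eE][+-]?)0*(\d{2,})$")


def normalize_float_text(text):
    # Hypothetical helper: map platform-specific spellings to one form.
    upper = text.upper()
    if "#INF" in upper:
        # '-1.#INF' is the Windows spelling quoted in the message above.
        return "-inf" if text.startswith("-") else "inf"
    if "#QNAN" in upper or "#IND" in upper:
        # Assumed NaN spellings; not taken from the thread.
        return "nan"
    # '1e+010' -> '1e+10'; strings without a padded exponent are unchanged.
    return _EXPONENT.sub(r"\1\2", text)


for raw in ["-1.#INF", "1e+010", "1e+10", "2.5e-005", "1.5"]:
    print(raw, "->", normalize_float_text(raw))
```

Going the other way (padding the exponent to three digits to match the old Python 2.5 output on Windows) would be the backward-compatibility option mentioned above.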
import numpy as np
a = 1e10
print np.float32(a) # -> call format_float
print np.float64(a) # -> do not call format_double
print np.float96(a) # -> call format_longdouble
I guess the difference with float64 comes from its multiple inheritance (that is, it derives from the builtin float, and the rules for print are different than for the others). Is this behavior the expected one?
Expected, but I would like to see it change because it is kind of frustrating. Fixing it probably involves setting a function pointer in the type definition, but I am not sure about that. We might also want to do something about integers, as in Python 3.0 they will all be Python long integers. I don't know if that actually breaks anything in numpy, or how Python 3.0 implements integers, but it might be a good idea not to derive from Python integers. How that will affect indexing speed I don't know.

Chuck