[Numpy-discussion] formatting issues, locale and co

Sun Dec 28 01:38:29 EST 2008

On Sat, Dec 27, 2008 at 10:27 PM, David Cournapeau <
david at ar.media.kyoto-u.ac.jp> wrote:

> Hi,
>
>    While looking at the last failures of numpy trunk on windows for
> python 2.5 and 2.6, I got into floating point number formatting issues;
> I got deeper and deeper, and now I am lost. We have several problems:
>    - we are not consistent between platforms, nor are we consistent
> with python
>    - str(np.float32(a)) is locale dependent, but python str method is
> not (locale.str is)
>    - formatting of long double does not work on windows because of the
> broken long double support in mingw.
>
> 1 consistency problem:
> ----------------------
>
> python -c "a = 1e20; print a" -> 1e+020
> python26 -c "a = 1e20; print a" -> 1e+20
>
> In numpy, we use PyOS_snprintf for formatting, but python itself uses
> PyOS_ascii_formatd - which has different behavior on different versions
> of python. The above behavior can be simply reproduced in C:
>
> #include <Python.h>
>
> int main()
> {
>    double x = 1e20;
>    char c[200];
>
>    PyOS_ascii_formatd(c, sizeof(c), "%.12g", x);
>    printf("%s\n", c);
>    printf("%g\n", x);
>
>    return 0;
> }
>
> On 2.5, this will print:
>
> 1e+020
> 1e+020
>
> But on 2.6, this will print:
>
> 1e+20
> 1e+020
>
> 2 locale dependency:
> --------------------
>
> Another issue is that our own formatting is locale dependent, whereas
> python isn't:
>
> import numpy as np
> import locale
> locale.setlocale(locale.LC_NUMERIC, 'fr_FR')
> a = 1.2
>
> print "str(a)", str(a)
> print "locale.str(a)", locale.str(a)
> print "str(np.float32(a))", str(np.float32(a))
> print "locale.str(np.float32(a))", locale.str(np.float32(a))
>
> Returns:
>
> str(a) 1.2
> locale.str(a) 1,2
> str(np.float32(a)) 1,2
> locale.str(np.float32(a)) 1,20000004768
>
> I thought about copying the way python does the formatting in the trunk
> (where discrepancies between platforms have been fixed), but this is not
> so easy, because it uses a lot of code from different places - and the
> code needs to be adapted to float and long double. The other solution
> would be to do our own formatting, but this does not sound easy:
> formatting in C is hard. I am not sure what we should do; does
> anyone else have any ideas?
>
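
The exponent-width discrepancy in the quoted message (1e+020 versus 1e+20) could in principle be smoothed over by post-processing the formatted string. The Python sketch below is only an illustration of that idea: normalize_exponent is a hypothetical helper, not part of Numpy or Python, and it assumes the input is already a '%g'- or '%e'-style string.

```python
import re

# Hypothetical helper (not Numpy/Python API): strip extra leading zeros
# from a printf-style exponent while keeping at least two exponent digits.
_EXPONENT = re.compile(r"([eE][+-])0*(\d{2,})$")


def normalize_exponent(text):
    """Return text with its exponent reduced to a minimum of two digits.

    Strings without an exponent are returned unchanged.
    """
    return _EXPONENT.sub(r"\1\2", text)
```

With this, the three-digit exponent printed by the Windows C runtime and the two-digit form printed elsewhere both come out as 1e+20, while exponents that genuinely need three digits (such as 1e+100) are left alone.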

I think the first thing to do is make a decision on locale. If we choose to
support locales, I don't see much choice but to depend on Python, because it's
too much work otherwise, and work not directly related to Numpy at that. If
we decide not to support locales, then we can do our own formatting if we
need to, using a fixed choice of locale. There is a list of snprintf
implementations here <http://www.ijs.si/software/snprintf/>.
Trio <http://daniel.haxx.se/projects/trio/> looks like a mature project
and has an MIT license, which I think is compatible with Numpy.
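
As a rough illustration of the "fixed choice of locale" option, the Python sketch below formats a float through the locale-aware path and then forces the decimal separator back to '.'. The names force_dot_decimal and format_float_fixed are hypothetical and not Numpy or Python API; locale.format_string and locale.localeconv are the only standard-library calls used. It assumes a plain float format such as "%.12g" with no digit grouping.

```python
import locale


def force_dot_decimal(text, decimal_point):
    """Replace the first locale decimal separator in text with '.'."""
    if decimal_point and decimal_point != ".":
        return text.replace(decimal_point, ".", 1)
    return text


def format_float_fixed(value, fmt="%.12g"):
    """Format value under the current LC_NUMERIC locale, then undo the
    locale-specific decimal separator so the result always uses '.'."""
    text = locale.format_string(fmt, value)
    decimal_point = locale.localeconv()["decimal_point"]
    return force_dot_decimal(text, decimal_point)
```

Under a locale such as fr_FR the intermediate string would be "1,2"; the helper turns that back into "1.2", which matches what str() gives for a Python float in the quoted example.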

I'm inclined to just fix the locale and ignore the rest until Python gets
things sorted out. But I'm lazy...

Chuck