Solutions for LC_NUMERIC, was Re: [Python-Dev] Re: Be Honest about LC_NUMERIC [REPOST]

Christian Reis kiko at async.com.br
Wed Sep 3 11:41:55 EDT 2003


On Tue, Sep 02, 2003 at 10:14:10PM -0400, Tim Peters wrote:
> > That said: Implementations might choose to ignore the standard in that
> > respect. This issue just supports my thesis that the patch is
> > complicated: If I have to read the C99 standard to find out whether it
> > is correct, it must be complicated. I doubt either the submitters or
> > the original author of the code did that exercise...
> 
> Well, any time a C library function gets called, you may have to scour the
> standard to answer questions about it -- and if they're libm functions you
> can just trust me that the standard doesn't contain useful answers <wink>.
> That's what comments are for.

Well, I disagree that the patch is complicated (it's not -- the code was
preserved to simplify porting fixes to and from glib, but that's not a
firm decision and can be changed); the problem is that the corner cases
here aren't trivial to grasp. We have libc functions that rely on global
state (locale information), with no locking primitives, and no
cross-platform support for a locale-independent API.

When I wrote to python-dev initially, I wasn't aware of all the issues
at hand; what I really wanted was to bring up the matter and try and
find ways to solve the problem. The proposal I brought up is a WIP, and
I expected it to change after initial discussion had come up (since all
Martin had said previous to my PEP was "no" ;-)

I have come to agree with Martin (but only after perceiving Tim's
problem with Windows and Outlook) that it would be better if we found
platform-specific locale-independent conversion functions. However, it 
seems that we won't be able to find them on *all* platforms (and, from
Tim's findings -- I don't have access to Windows -- they're missing on a
major platform).

Now, agreeing on the solution depends on finding the correct compromise.
Here's my reasoning:

    - The current implementation is broken -- it's not thread-safe, and
      it makes the locale inconsistent in applications that change
      LC_NUMERIC via the locale module.

    - We should try and use locale-independent primitives where
      available: they offer thread-safety, apart from solving the
      immediate problem.
      
      This is fixed with Gustavo's patch *on glibc platforms*, which
      uses strtod_l and the glibc 2.3 thread-aware uselocale() and
      snprintf() API.

      This is *not* fixed with Gustavo's patch for other platforms.

    - On platforms where locale-independent primitives are unavailable,
      we need to provide our own solution. There are three alternatives
      we have explored here:

        a) massage the data to/from a locale-accepted format and call
           the locale-dependent libc functions. This is *not*
           thread-safe.

        b) provide our own, complete, implementation of strtod and
           snprintf.

        c) link against a library that offers a locale-safe version of
           the functions.

      Alternatives b) and c) would be thread-safe. a) is not.

      The problem with alternative b) is finding somebody with the free
      cycles to implement the function *AND* convincing python-dev that
      the code is correct. Gustavo did some looking through the glibc
      code for the functions in question yesterday and, well, if people
      scoffed at code from glib, I dare them to review that code.

      I assume c) could be provided on Windows with cygwin.
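
To make alternative a) concrete, here is a minimal Python sketch of the
"massage and call the locale-dependent function" idea. It is not Gustavo's
patch: the helper names are hypothetical, and locale.atof() merely stands in
for a locale-dependent libc strtod(). It assumes the only difference between
the two formats is the decimal point character.

```python
import locale


def to_locale_format(text, decimal_point):
    """Rewrite a '.'-based float literal for a locale-aware parser."""
    return text.replace(".", decimal_point)


def from_locale_format(text, decimal_point):
    """Rewrite locale-formatted output back to the '.'-based form."""
    return text.replace(decimal_point, ".")


def parse_float_via_locale(text):
    """Parse a '.'-based literal using a locale-dependent parser.

    Hypothetical helper: locale.atof() plays the role of strtod() here.
    """
    # localeconv() reads process-global LC_NUMERIC state.
    decimal_point = locale.localeconv()["decimal_point"]
    massaged = to_locale_format(text, decimal_point)
    # If another thread calls setlocale() between the lookup above and the
    # parse below, `massaged` no longer matches the active locale. That
    # window is why this approach is not thread-safe.
    return locale.atof(massaged)
```

The two lookups of global state (one to pick the decimal point, one inside
the parser) are what a locale-independent primitive would avoid.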

Solution b) above would solve the problem completely. Solution c)
would solve the problem only on platforms where such libraries are
available. If nobody can find, contribute, or accept either of these
solutions, then I suggest using solution a) and documenting the
problem in the release notes with text such as

    "On glibc-based systems, Python is safe from runtime changes in the
    LC_NUMERIC locale category. On other platforms, changing LC_NUMERIC
    at runtime may cause float conversion routines to fail. This may
    impact the Python interpreter internally."

Note that apart from the glibc part, this is also true today (it's just
that a C library is the only thing able to change LC_NUMERIC right
now). In the worst case (no thread-safe locale) we are not a lot worse
than where we started -- we are not thread-safe, but at least we can get
locale consistency for applications that don't toy with LC_NUMERIC at
runtime (when it's none of their business <wink>).

Take care,
--
Christian Reis, Senior Engineer, Async Open Source, Brazil.
http://async.com.br/~kiko/ | [+55 16] 261 2331 | NMFL


