[Python-Dev] Unicode locale values in 2.7

Antoine Pitrou solipsis at pitrou.net
Thu Dec 3 12:33:34 CET 2009


Eric Smith <eric <at> trueblade.com> writes:
> 
> But in trunk, the value is just used as-is. So when formatting a decimal,
> for example, '\xc2\xa0' is just inserted into the result, such as:
> >>> format(Decimal('1000'), 'n')
> '1\xc2\xa0000'
> This doesn't make much sense,

Why doesn't it make sense? It's normal UTF-8: '\xc2\xa0' is the UTF-8
encoding of U+00A0 (NO-BREAK SPACE).
The same thing happens when the monetary sign is non-ASCII; see
Lib/test/test_locale.py for an example.
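
To illustrate the point about the bytes being ordinary UTF-8, here is a small
sketch. It assumes Python 3 syntax, where a bytes literal stands in for a 2.x
str, and it hard-codes the bytes from the quoted output rather than querying a
real locale.

```python
import unicodedata

# The bytes from the quoted output, hard-coded (no locale is consulted here).
raw = b"1\xc2\xa0000"

# Decoding as UTF-8 turns the two-byte sequence into a single character.
text = raw.decode("utf-8")

# Look up the Unicode name of the separator character.
name = unicodedata.name(text[1])

print(repr(text), name)
```

The separator takes two bytes in the encoded form and one character once
decoded, which is why the byte length and the character length differ.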

> I believe that the correct solution is to do what py3k does in locale,
> which is to convert the struct lconv values to unicode. But since this
> would be a disruptive change if universally applied, I'd like to propose
> that we only convert to unicode if the values won't fit into a str.

This would still be disruptive, because some programs may rely on these values
being bytestrings in the current locale encoding.
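
A rough sketch of the kind of code that could break, again modeled in Python 3
with bytes standing in for a 2.x str. The helper thousands_sep_text and the
hand-built dicts are hypothetical; they only mimic the shape of a localeconv()
result. Under 2.x the failure would look different (calling .decode() on a
unicode object first attempts an implicit ASCII encode), so treat this as an
analogy rather than a reproduction.

```python
def thousands_sep_text(conv, encoding):
    """Hypothetical helper: assumes the separator is an encoded bytestring."""
    return conv["thousands_sep"].decode(encoding)

# Hand-built stand-in for a localeconv() result in a UTF-8 locale.
bytes_conv = {"thousands_sep": b"\xc2\xa0"}
sep = thousands_sep_text(bytes_conv, "utf-8")

# The same caller, handed a value that has already been converted to text.
text_conv = {"thousands_sep": "\xa0"}
try:
    thousands_sep_text(text_conv, "utf-8")
    broke = False
except AttributeError:
    # Python 3 str has no .decode(); the caller's assumption no longer holds.
    broke = True
```

The caller's explicit decode step works only as long as the value really is a
bytestring; converting some values and not others would make that depend on
the locale in use.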

I'd say don't try to fix this, and encourage people to use py3k if they really
want safe unicode+locale. Proper unicode behaviour is one of py3k's main
features after all.

Regards

Antoine.



