[Python-Dev] Python and the Unicode Character Database

Antoine Pitrou solipsis at pitrou.net
Tue Nov 30 21:11:59 CET 2010


On Tuesday 30 November 2010 at 20:55 +0100, "Martin v. Löwis" wrote:
> Wrt. local number parsing, I think that the locale module would be
> way better than the nonsense that Python currently does. In the locale
> module, somebody at least has thought about what specifically
> constitutes a number. The current not-ASCII-but-not-local-either
> approach is just useless.

It depends on what you need. If you parse integers it's probably good
enough. And it's better to have a trustworthy standard (Unicode) than a
myriad of ad-hoc, possibly buggy or incomplete, often unavailable,
cultural specifications drafted by OS vendors who have no business (and
no expertise) in drafting them.
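
To make the integer case concrete, here is a minimal sketch that uses only
the standard unicodedata module. The helper name parse_unicode_int is made
up for this example; it is not an existing API, and it handles only
unsigned runs of decimal digits.

```python
import unicodedata

def parse_unicode_int(text):
    """Parse a non-negative integer written with Unicode decimal digits.

    Hypothetical helper for illustration only.
    """
    if not text:
        raise ValueError("empty string")
    value = 0
    for ch in text:
        # unicodedata.decimal() raises ValueError when ch is not a
        # decimal digit and no default is supplied.
        value = value * 10 + unicodedata.decimal(ch)
    return value

arabic_indic = "\u0661\u0662\u0663"  # ARABIC-INDIC DIGITS ONE, TWO, THREE
print(parse_unicode_int(arabic_indic))  # 123
print(int(arabic_indic))                # built-in int() in Python 3 agrees: 123
```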

At least you can build more sophisticated routines on the simple
information given to you by the Unicode database. You cannot build
anything solid on the C locale functions (and even then you are limited
by various issues inherent in the locale semantics, such as the fact
that it relies on process-wide state, which would only be ok, at best,
for single-user applications). There's a reason that e.g. Babel (*)
reimplements locale-like functionality from scratch.

(*) http://pypi.python.org/pypi/Babel/
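
As a rough sketch of what "more sophisticated" could mean, the routine
below rejects numbers that mix digits from different digit sets (say,
ASCII and Arabic-Indic). The function name is hypothetical, and the code
assumes that each set of Unicode decimal digits occupies ten consecutive
code points in ascending order, so that subtracting a digit's value from
its code point yields the code point of that set's zero.

```python
import unicodedata

def parse_single_script_int(text):
    """Parse an unsigned integer whose digits all come from one digit set.

    Hypothetical helper for illustration; assumes each decimal digit set
    is a contiguous run of code points for 0..9.
    """
    if not text:
        raise ValueError("empty string")
    zeros = set()  # code point of the zero digit for each set seen
    value = 0
    for ch in text:
        digit = unicodedata.decimal(ch)  # ValueError if not a decimal digit
        zeros.add(ord(ch) - digit)
        value = value * 10 + digit
    if len(zeros) > 1:
        raise ValueError("digits from more than one digit set")
    return value

print(parse_single_script_int("\u0661\u0662\u0663"))  # 123
try:
    parse_single_script_int("1\u0662")  # ASCII digit followed by Arabic-Indic digit
except ValueError as exc:
    print("rejected:", exc)
```

Nothing here touches process-wide state, so it can be called from several
threads or on behalf of several users at once.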

Regards

Antoine.



