[Tutor] string.uppercase: too many for locale
kent37 at tds.net
Thu Jan 11 12:10:15 CET 2007
Barnaby Scott wrote:
> Thanks, but this raises various questions:
I am in no way an expert on this; I am guessing. If anyone else knows
for sure what is going on, please let me know!
> Why would my locale have 'changed' - and from what?
The docs for the locale module say "According to POSIX, a program which
has not called setlocale(LC_ALL, '') runs using the portable 'C'
locale. Calling setlocale(LC_ALL, '') lets it use the default locale as
defined by the LANG variable." So "from what" is the so-called 'C'
locale; this is the setting on my machine. Why it would change I don't
know - due to an environment variable setting or perhaps a setlocale()
call in sitecustomize.py?
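
To see which locale a program is actually running under, something like
the following might help. It is only a sketch: it is written for a
current Python 3 interpreter (an assumption, this thread predates it),
the helper name describe_locale_switch is made up for illustration, and
the values you get depend entirely on your platform and environment.
Python 3 may also already have set LC_CTYPE from the environment at
startup, so "before" is not guaranteed to be 'C'.

```python
import locale


def describe_locale_switch(category=locale.LC_CTYPE):
    """Return the locale setting before and after adopting the user's default.

    Hypothetical helper, for illustration only.
    """
    # Calling setlocale() with no second argument only queries the setting.
    before = locale.setlocale(category)
    try:
        # An empty string asks for the user's default locale,
        # normally taken from environment variables such as LANG.
        locale.setlocale(locale.LC_ALL, '')
    except locale.Error:
        # The environment names a locale this system does not provide;
        # the previous setting stays in effect.
        pass
    after = locale.setlocale(category)
    return before, after


before, after = describe_locale_switch()
print("before:", before)
print("after: ", after)
```

If "before" already shows something other than 'C', something has set
the locale before your code ran.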
> What *would* be the appropriate locale given that I am in the UK and use
> English, and how would I set it?
I think your locale is appropriate...
> Why on earth does the ['English_United Kingdom', '1252'] locale setting
> consider ŠŒŽŸÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖØÙÚÛÜÝÞ to be appropriate?
The locale specifies 1252 as the encoding. Presumably this is Windows
code page 1252. All of the characters you list are valid uppercase
characters in that encoding.
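
One way to check this without touching the locale at all is to decode
the upper half of code page 1252 and keep the characters that Unicode
classifies as uppercase. This is a sketch for Python 3 (an assumption),
using the standard "cp1252" codec and str.isupper(); it is not how
string.uppercase is computed, which goes through the C library, and the
function name cp1252_uppercase is made up for illustration.

```python
def cp1252_uppercase():
    """Return the non-ASCII characters of cp1252 that Unicode calls uppercase."""
    # Bytes 0x80-0xFF are the part of the code page beyond plain ASCII.
    raw = bytes(range(0x80, 0x100))
    # A few byte values are undefined in cp1252, so skip them
    # instead of raising UnicodeDecodeError.
    text = raw.decode("cp1252", errors="ignore")
    return "".join(ch for ch in text if ch.isupper())


letters = cp1252_uppercase()
print(letters)
print(len(letters), "uppercase characters beyond A-Z")
```

The result can be compared with the list quoted in the question above.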
> Is this less to do with Python than the operating system?
Python's locale capabilities are built on those of the underlying C
language so that is probably where this is coming from.
> Where can I read more on the subject?
comp.lang.python is where I would start; ask your question there. State
your confusion and ask your real question - how do I find out the actual
uppercase letters for the language in use? If you come at it from
"Python is broken, why does it do such a stupid thing" you will get the
explanation of why it is not broken rather than the solution to your
problem.
> Sorry for all the open-ended questions, but I am baffled by this and can
> find no information. Sadly, just using string.ascii_uppercase is not a
> solution because I am trying to develop something for different locales,
> but only want the actual letters that a particular language uses to be
> returned - e.g. English should be A-Z only, Swedish should be A-Z + ÅÄÖ
> (only) etc. The thing I really want to avoid is having to hard-code for
> every language on the planet - surely this is the whole point of locale
> settings, and locale-dependent functions and constants?
> Barnaby Scott