[Tutor] string.uppercase: too many for locale

Kent Johnson kent37 at tds.net
Thu Jan 11 12:10:15 CET 2007

Barnaby Scott wrote:
> Thanks, but this raises various questions:

I am in no way an expert on this; I am guessing. If anyone else knows 
for sure what is going on, please let me know!

> Why would my locale have 'changed' - and from what?

The docs for the locale module say "According to POSIX, a program which 
has not called setlocale(LC_ALL, '') runs using the portable 'C' 
locale. Calling setlocale(LC_ALL, '') lets it use the default locale as 
defined by the LANG variable." So "from what" is the so-called 'C' 
locale; this is the setting on my machine. Why it would change I don't 
know - due to an environment variable setting or perhaps a setlocale() 
call in sitecustomize.py?
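
A quick way to check what the docs describe is to ask the locale module 
for the current setting before and after adopting the environment's 
default. This is only a sketch, written for Python 3; the variable names 
are illustrative, and the strings you get back depend entirely on your 
platform and environment.

```python
import locale

# Passing no second argument only queries the setting; nothing is changed.
startup = locale.setlocale(locale.LC_ALL)
print("at startup:", startup)

# An empty string asks for the user's default locale from the environment
# (LANG and friends on POSIX, the regional settings on Windows).
try:
    current = locale.setlocale(locale.LC_ALL, "")
except locale.Error:
    # The environment named a locale this system does not provide.
    current = startup
print("after setlocale(LC_ALL, ''):", current)
```

If the first line already shows something other than 'C', then something 
set the locale before your code ran.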

> What *would* be the appropriate locale given that I am in the UK and use
> English, and how would I set it?

I think your locale is appropriate...

> Why on earth does the ['English_United Kingdom', '1252'] locale setting

The locale specifies 1252 as the encoding. Presumably this is Windows 
code page 1252. All of the characters you list are valid uppercase 
characters in that encoding.
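
To see why the list is longer than A-Z, you can walk every byte value in 
code page 1252 and keep the ones that are uppercase letters. The sketch 
below (Python 3) uses Unicode's idea of "uppercase" via str.isupper() as 
a stand-in; that is an assumption, since the C library's own table for 
the locale is what string.uppercase actually reflected, and the two may 
not agree exactly.

```python
# Collect every cp1252 character that Unicode classifies as uppercase.
upper = []
for b in range(256):
    try:
        ch = bytes([b]).decode("cp1252")
    except UnicodeDecodeError:
        continue  # a few byte values are undefined in cp1252
    if ch.isupper():
        upper.append(ch)

print(len(upper))
print("".join(upper))
```

The result includes A-Z plus accented capitals such as Å, Ä and Ö, which 
is a property of the encoding as a whole, not of any one language.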

> Is this less to do with Python than the operating system?

Python's locale capabilities are built on those of the underlying C 
language so that is probably where this is coming from.

> Where can I read more on the subject?

comp.lang.python is where I would start; ask your question there. State 
your confusion and ask your real question - how do I find out the actual 
uppercase letters for the language in use? If you come at it from 
"Python is broken, why does it do such a stupid thing" you will get the 
explanation of why it is not broken rather than the solution to your 
problem.


> Sorry for all the open-ended questions, but I am baffled by this and can
> find no information. Sadly, just using string.ascii_uppercase is not a
> solution because I am trying to develop something for different locales,
> but only want the actual letters that a particular language uses to be
> returned - e.g. English should be A-Z only, Swedish should be A-Z + ÅÄÖ
> (only) etc. The thing I really want to avoid is having to hard-code for
> every language on the planet - surely this is the whole point of locale
> settings, and locale-dependent functions and constants?
> Thanks
> Barnaby Scott
