[Tutor] string.uppercase: too many for locale
Kent Johnson
kent37 at tds.net
Thu Jan 11 01:28:46 CET 2007
Barnaby Scott wrote:
> Can anyone explain the following: I was getting string.uppercase
> returning an unexpected number of characters, given that the Python Help
> says that it should normally be A-Z. Being locale-dependent, I checked
> that my locale was not set to something exotic, and sure enough it is
> only what I expected - see below:
>
>
> IDLE 1.1 ==== No Subprocess ====
> >>> import locale, string
> >>> locale.getlocale()
> ['English_United Kingdom', '1252']
> >>> print string.uppercase
> ABCDEFGHIJKLMNOPQRSTUVWXYZŠŒŽŸÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖØÙÚÛÜÝÞ
> >>> print string.lowercase
> abcdefghijklmnopqrstuvwxyzƒšœžßàáâãäåæçèéêëìíîïðñòóôõöøùúûüýþÿ
> >>>
>
> What am I missing here? Surely for UK English, I really should just be
> getting A-Z and a-z. In case it is relevant, the platform is Windows 2000.
Interesting. Here is what I get:
>>> import locale, string
>>> locale.getlocale()
(None, None)
>>> string.uppercase
'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
Somehow the locale in your session has been changed from the default 'C'
locale. If I switch to my system's default locale, I get results similar to yours:
>>> locale.setlocale(locale.LC_ALL, '')
'English_United States.1252'
>>> locale.getlocale()
('English_United States', '1252')
>>> print string.uppercase
ABCDEFGHIJKLMNOPQRSTUVWXYZèîÄƒ└┴┬├─┼╞╟╚╔╩╦╠═╬╧╨╤╥╙╘╒╓╪┘┌█▄▌▐
which doesn't print correctly because my console encoding is actually
cp437 not cp1252.
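To illustrate the mismatch, here is a minimal sketch (not part of the session above) that decodes the same three bytes with both code pages. The byte values and variable names are chosen only for illustration; 0xC0-0xC2 are accented capitals in cp1252 but box-drawing characters in cp437.

```python
# Three byte values that cp1252 treats as uppercase letters.
raw = b'\xc0\xc1\xc2'

# Decoded as cp1252, the bytes are the accented capitals A-grave, A-acute, A-circumflex.
as_cp1252 = raw.decode('cp1252')

# Decoded as cp437 (a common Windows console code page), the same bytes
# are box-drawing characters.
as_cp437 = raw.decode('cp437')

# Show the code points rather than the glyphs, so the result does not
# depend on what the console itself can display.
print([hex(ord(c)) for c in as_cp1252])  # ['0xc0', '0xc1', '0xc2']
print([hex(ord(c)) for c in as_cp437])   # ['0x2514', '0x2534', '0x252c']
```

So the bytes are the same in both cases; only the interpretation applied by the console differs.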
It looks like string.uppercase is giving you all the characters that are
uppercase in the current locale's character set (cp1252 here), which seems
reasonable. If you want just A-Z, use string.ascii_uppercase, which does not
depend on the locale.
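As a rough sketch of that suggestion, the snippet below checks that string.ascii_uppercase stays the same after switching to the default locale. The helper is_ascii_upper is a hypothetical name of my own, and the try/except is only there because some systems have no usable default locale.

```python
import locale
import string

before = string.ascii_uppercase

# Switch to the user's default locale, as in the session above.
# Some environments have no usable default locale, so a failure is tolerated.
try:
    locale.setlocale(locale.LC_ALL, '')
except locale.Error:
    pass

after = string.ascii_uppercase

def is_ascii_upper(ch):
    """Return True only for a single character in A-Z (hypothetical helper)."""
    return len(ch) == 1 and ch in string.ascii_uppercase

# The constant is the same before and after the locale change.
print(before == after)  # True
```

A test such as is_ascii_upper(ch) therefore gives the same answer whatever locale the session happens to be in.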
Kent