[Python-Dev] str.ascii_lower
Guido van Rossum
guido at python.org
Mon Dec 29 12:37:37 EST 2003
> Looking at python.org/sf/866982, I find it troubling that
> there are languages where "I".lower() != "i"
> (for those of you not familiar with Turkish: the lower-case
> letter of "I" is U+0131, LATIN SMALL LETTER DOTLESS I,
> which is \xfd in iso-8859-9).
>
> As a solution, I'd like to propose a new method ascii_lower,
> which is locale-unaware and only works for bytes 65..90
> (returning the byte itself for all other characters).
>
> Similarly, ascii_upper might be needed "for symmetry";
> I don't know whether the symmetry should extend beyond
> those two.
>
> This, in turn, should be used inside the codecs library
> where encoding names are normalized to lower case.
>
> What do you think?
I never thought there were locales possible that affected the mappings
inside ASCII either.
But shouldn't this work just as well if it's only for encoding names
(which I'd hope would be ASCII themselves):
def ascii_lower(s):
    return str(unicode(s).lower())
The unicode() call converts ASCII to Unicode, which should always work
for encoding names, and the Unicode lower() is locale-independent.
This seems more elegant than adding yet more methods to the str type.
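For comparison, here is a rough sketch of the byte-range behaviour described in the quoted proposal (map only 65..90, return everything else unchanged). It is written for present-day Python 3, where str is already Unicode, so it is an illustration under that assumption and not the code that the codecs library uses; the helper name is simply borrowed from the proposal.

```python
def ascii_lower(s):
    """Lower-case only 'A'..'Z' (code points 65..90); leave all else alone.

    Assumption: s is a Python 3 str. No locale is consulted anywhere.
    """
    return "".join(
        chr(ord(ch) + 32) if 65 <= ord(ch) <= 90 else ch
        for ch in s
    )


# Encoding names are expected to be plain ASCII.
print(ascii_lower("ISO-8859-9"))
print(ascii_lower("UTF-8"))
```

Because the mapping is spelled out by code point, "I" always becomes "i" here, whatever locale the process happens to be running in.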
--Guido van Rossum (home page: http://www.python.org/~guido/)