python 2.7.12 on Linux behaving differently than on Windows

BartC bc at freeuk.com
Thu Dec 8 18:19:55 EST 2016


On 08/12/2016 22:31, Chris Angelico wrote:
> On Fri, Dec 9, 2016 at 8:42 AM, BartC <bc at freeuk.com> wrote:
>> Python3 tells me that original, lower-case and upper-case versions are:
>>
>> ßẞıİiIÅσςσ
>> ßßıi̇iiåσςσ
>> SSẞIİIIÅΣΣΣ
>
> Now lower-case the upper-case version and see what you get. And
> upper-case the lower-case version. Because x.upper().lower() should be
> the same as x.lower(), right? And x.lower().upper().lower() is the
> same too. Right?

I get this (although I suspect Thunderbird will screw up the tabs); the 
code I used follows at the end:

Org 	 L 	 U 	 L->U 	 U->L

A 	 a 	 A 	 A 	 a      Letters
65 	 97 	 65 	 65 	 97     Ordinals
1 	 1 	 1 	 1 	 1      Lengths
   	   	   	   	
32 	 32 	 32 	 32 	 32
1 	 1 	 1 	 1 	 1

ß 	 ß 	 SS 	 SS 	 ss
223 	 223 	 83 	 83 	 115
1 	 1 	 2 	 2 	 2

ẞ 	 ß 	 ẞ 	 SS 	 ß
7838 	 223 	 7838 	 83 	 223
1 	 1 	 1 	 2 	 1

ı 	 ı 	 I 	 I 	 i
305 	 305 	 73 	 73 	 105
1 	 1 	 1 	 1 	 1

İ 	 i̇ 	 İ 	 İ 	 i̇
304 	 105 	 304 	 73 	 105
1 	 2 	 1 	 2 	 2

i 	 i 	 I 	 I 	 i
105 	 105 	 73 	 73 	 105
1 	 1 	 1 	 1 	 1

I 	 i 	 I 	 I 	 i
73 	 105 	 73 	 73 	 105
1 	 1 	 1 	 1 	 1

Å 	 å 	 Å 	 Å 	 å
8491 	 229 	 8491 	 197 	 229
1 	 1 	 1 	 1 	 1

σ 	 σ 	 Σ 	 Σ 	 σ
963 	 963 	 931 	 931 	 963
1 	 1 	 1 	 1 	 1

ς 	 ς 	 Σ 	 Σ 	 σ
962 	 962 	 931 	 931 	 963
1 	 1 	 1 	 1 	 1

σ 	 σ 	 Σ 	 Σ 	 σ
963 	 963 	 931 	 931 	 963
1 	 1 	 1 	 1 	 1

z 	 z 	 Z 	 Z 	 z
122 	 122 	 90 	 90 	 122
1 	 1 	 1 	 1 	 1

I've added A, space and z.

As I said, some characters have ill-defined upper- and lower-case 
conversions, even if some aren't as esoteric as I'd thought.
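
A compact way to see which characters in the table break the identities Chris asked about is to test them directly. This is only a sketch: the helper name roundtrip_mismatches is made up for illustration, and the sample string is written with escapes so that the Angstrom sign (U+212B) survives mail clients that might normalise it.

```python
# Sample string from the post, written with escapes so no mail client or
# editor can silently normalise U+212B (ANGSTROM SIGN) to U+00C5.
s = "A \u00df\u1e9e\u0131\u0130iI\u212b\u03c3\u03c2\u03c3z"

def roundtrip_mismatches(text):
    """Return the characters c for which x.upper().lower() != x.lower()
    or x.lower().upper() != x.upper() (hypothetical helper)."""
    return [c for c in text
            if c.upper().lower() != c.lower()
            or c.lower().upper() != c.upper()]

mismatches = roundtrip_mismatches(s)
print([hex(ord(c)) for c in mismatches])
```

On the Python 3 build that produced the table above, the characters flagged are exactly those whose L column differs from U->L or whose U column differs from L->U; the plain letters A, i, I and z, the space, and σ pass both checks.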

In English however the conversions are perfectly well defined for A-Z 
and a-z, while they are not meaningful for characters such as space, and 
for digits.

In English such conversions are immensely useful, and it is invaluable 
for many purposes to have upper and lower case interchangeable (for 
example, you don't have separate sections in a dictionary for words 
starting with A and those starting with a).

So it is perfectly possible to have case conversion defined for English, 
while other alphabets can do what they like.
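
As a rough illustration of that idea, here is a minimal sketch of a conversion that touches only A-Z and a-z and passes everything else through unchanged. The names ascii_upper and ascii_lower are invented for this example; they are not functions in the standard library, and this is only one way it could be done.

```python
import string

# Translation tables covering only the 26 English letters.
_TO_UPPER = str.maketrans(string.ascii_lowercase, string.ascii_uppercase)
_TO_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)

def ascii_upper(text):
    """Upper-case a-z only; every other character is left alone (hypothetical helper)."""
    return text.translate(_TO_UPPER)

def ascii_lower(text):
    """Lower-case A-Z only; every other character is left alone (hypothetical helper)."""
    return text.translate(_TO_LOWER)

sample = "A \u00df\u0131\u0130iIz 42"
print(ascii_upper(sample))
print(ascii_lower(sample))
```

Because each table maps one character to one character, the length never changes, and ascii_lower(ascii_upper(x)) always equals ascii_lower(x), which is the property the full Unicode conversions above fail to guarantee.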

It is a little ridiculous however to have over two thousand distinct 
files all with the lower-case normalised name of "harry_potter".

What were we talking about again? Oh yes, belittling me because I work 
with Windows!

---------------
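# Use an escape rather than a literal tab, which mail clients may mangle.
tab = "\t"

# Ordinal of the first code point only; a conversion can yield more than one.
def ord1(c):
    return ord(c[0])

# Print c, then its lower, upper, lower->upper and upper->lower forms.
def showcases(c):
    print(c, tab, c.lower(), tab, c.upper(), tab, c.lower().upper(), tab,
          c.upper().lower())

# Same five columns, showing the ordinal of each result's first code point.
def showcases_ord(c):
    print(ord1(c), tab, ord1(c.lower()), tab, ord1(c.upper()), tab,
          ord1(c.lower().upper()), tab, ord1(c.upper().lower()))

# Same five columns, showing the length of each result in code points.
def showcases_len(c):
    print(len(c), tab, len(c.lower()), tab, len(c.upper()), tab,
          len(c.lower().upper()), tab, len(c.upper().lower()))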

s="A ßẞıİiIÅσςσz"

print("Org\tL\tU\tL->U\tU->L")

for c in s:
	showcases(c)
	showcases_ord(c)
	showcases_len(c)
	print()

-- 
Bartc

