Changing filenames from Greeklish => Greek (subprocess complain)
Νικόλαος Κούρας
nikos.gr33k at gmail.com
Thu Jun 6 14:46:20 EDT 2013
On Thursday, 6 June 2013 3:44:52 PM UTC+3, user Steven D'Aprano wrote:
> py> s = '999-Eυχή-του-Ιησού'
> py> bytes_as_utf8 = s.encode('utf-8')
> py> t = bytes_as_utf8.decode('iso-8859-7', errors='replace')
> py> print(t)
> 999-EΟΟΞ�-ΟΞΏΟ-ΞΞ·ΟΞΏΟ
Does errors='replace' mean: don't break in case of an error?
You took the unicode string 's' and encoded it into a UTF-8 bytestring.
Then how is it possible to ask for the UTF-8 bytestring to be decoded back to a unicode string using a different charset than the one used for encoding, and how did this actually print the filename in greek-iso?
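A minimal sketch of what the quoted session is doing, assuming Python 3; the shorter sample string and the variable names are only illustrative. Decoding with a codec other than the one used for encoding does not give the original text back: each byte is simply looked up in the other codec's table, and errors='replace' substitutes U+FFFD for any byte that table does not define, instead of raising an exception.

```python
# Illustrative sample string (not the original filename).
s = 'Ευχή'

# Encode the text to bytes using UTF-8: each Greek letter becomes two bytes.
utf8_bytes = s.encode('utf-8')

# Decoding with the same codec recovers the original string.
round_trip_ok = utf8_bytes.decode('utf-8') == s

# Decoding with a different codec under the default errors='strict' raises
# as soon as a byte has no meaning in that codec
# (0xAE, the second byte of 'ή', is unassigned in ISO-8859-7).
try:
    utf8_bytes.decode('iso-8859-7')
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# errors='replace' keeps going and puts U+FFFD where a byte cannot be mapped.
garbled = utf8_bytes.decode('iso-8859-7', errors='replace')

print(utf8_bytes)            # the raw bytes
print(round_trip_ok)         # same codec both ways
print(strict_failed)         # strict decoding with the wrong codec
print('\ufffd' in garbled)   # replacement character present
```

So the decode call does not "know" the bytes were produced by UTF-8; it only reinterprets them, which is why the result looks like Greek-alphabet garbage rather than the original name.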
> So that demonstrates part of your problem: even though your Linux system
> is using UTF-8, your terminal is probably set to ISO-8859-7. The
> interaction between these will lead to strange and disturbing Unicode
> errors.
Yes, I feel this is the problem too.
It's a wonder to me why PuTTY used greek-iso by default instead of utf-8!
Please explain this to me, because now that I am beginning to understand these encode/decode things, I am beginning to like them!
a) WHAT does it mean when a Linux system is set to use utf-8?
b) WHAT does it mean when a terminal client is set to use utf-8?
c) WHAT happens when the two of them try to work together?
> So I believe I understand how your file name has become garbage. To fix
> it, make sure that your terminal is set to use UTF-8, and then rename it.
> Do the same with every file in the directory until the problem goes away.
nikos at superhost.gr [~/www/cgi-bin]# echo $LS_OPTIONS
--color=tty -F -a -b -T 0
Is this okay? Is the '-b' option for displaying a filename in binary mode?
Indeed, I have changed PuTTY to use 'utf-8' and 'ls -l' now displays the filename in correct Greek letters. When I switch PuTTY's encoding back to 'greek-iso', the *displayed* filenames show up as mojibake.
WHAT is being displayed and what is actually stored as bytes are two different things, right?
Ευχη του Ιησου.mp3
EΟΟΞ�-ΟΞΏΟ-ΞΞ·ΟΞΏΟ
These are just the ways the filename is displayed in the terminal, depending on the encoding the terminal uses, correct? But no matter *how* it is being displayed, those two are the same file?
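The distinction between stored bytes and displayed text can be sketched as follows. This is only an illustration in Python 3: stored_name and render are hypothetical names, and render merely imitates a terminal by decoding the same bytes with whatever charset the terminal is configured for.

```python
# Hypothetical example: the bytes of a filename as the filesystem stores them.
stored_name = 'Ευχή του Ιησού.mp3'.encode('utf-8')

def render(name_bytes, terminal_encoding):
    """Imitate a terminal: interpret the same bytes using its configured charset."""
    return name_bytes.decode(terminal_encoding, errors='replace')

# The same bytes, shown by two differently configured terminals.
shown_utf8 = render(stored_name, 'utf-8')
shown_greek_iso = render(stored_name, 'iso-8859-7')

print(stored_name)                    # the bytes never change
print(shown_utf8 == shown_greek_iso)  # the two displays differ
```

Under these assumptions only the interpretation changes between the two calls; the bytes passed in are identical, which matches the idea that both displayed names refer to one and the same file.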