Changing filenames from Greeklish => Greek (subprocess complain)
Michael Torrie
torriem at gmail.com
Wed Jun 5 01:40:39 EDT 2013
On 06/04/2013 10:15 PM, Νικόλαος Κούρας wrote:
> One of my Greek filenames is "Ευχή του Ιησού.mp3". Just a Greek
> filename with spaces. Is there a problem when a filename contains both
> english and greek letters? Isn't it still a unicode string?
>
> All I did in my CentOS was 'mv "Euxi tou Ihsou.mp3" "Ευχή του
> Ιησού.mp3"'
>
> and the displayed filename after 'ls -l' returned was:
>
> is -rw-r--r-- 1 nikos nikos 3511233 Jun 4 14:11 \305\365\367\336\
> \364\357\365\ \311\347\363\357\375.mp3
>
> There is no way at all to check the charset used to store it in hdd?
> It should be UTF-8, but it doesn't look like it. Is there some linux
> command or some python command that will print out the actual
> encoding of '\305\365\367\336\ \364\357\365\
> \311\347\363\357\375.mp3' ?
I can see that you are starting to understand things. I can't answer
your question (I don't know the answer), but you're correct about one
thing: a filename is just a sequence of bytes. We'd hope it would be
UTF-8, but it could be anything. Even worse, there is no reliable way to
tell from a byte stream which encoding it uses; all we can do is try one
and see what happens. Text editors, for example, have to either make a
guess (UTF-8 is a good one these days), ask the user, or read the first
line of the file as ASCII and look for a source code encoding
declaration to give them an idea.
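
To make "try one and see what happens" concrete, here is a minimal
sketch, assuming Python 3. The bytes are the octal escapes from the
'ls -l' output quoted above; the helper name try_decodings and the list
of candidate encodings are my own choices for illustration, not
something from the original question.

```python
# Bytes of the filename exactly as 'ls -l' printed them (octal escapes).
raw = b"\305\365\367\336 \364\357\365 \311\347\363\357\375.mp3"


def try_decodings(data, candidates=("utf-8", "iso-8859-7", "cp1253")):
    """Try each candidate encoding and report what it produces.

    Returns a dict mapping encoding name to the decoded string, or to
    None when the bytes are not valid in that encoding.
    """
    results = {}
    for encoding in candidates:
        try:
            results[encoding] = data.decode(encoding)
        except UnicodeDecodeError:
            results[encoding] = None
    return results


if __name__ == "__main__":
    for encoding, text in try_decodings(raw).items():
        print(encoding, "->", text if text is not None else "not valid")
```

For these particular bytes the UTF-8 attempt fails, while ISO-8859-7
gives back the intended Greek name. Note that cp1253 decodes them to the
same text, so a successful decode is only evidence for an encoding, not
proof of it.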