Changing filenames from Greeklish => Greek (subprocess complain)

Νικόλαος Κούρας nikos.gr33k at gmail.com
Thu Jun 6 06:35:09 EDT 2013


On Thursday, 6 June 2013 at 11:50:55 AM UTC+3, Heiko Wundram wrote:
> On 05.06.2013 18:44, MRAB wrote:
>
> > From the previous posts I guessed that the filename might be encoded
> > using ISO-8859-7:
> >
> > >>> s = b"\305\365\367\336\ \364\357\365\ \311\347\363\357\375.mp3"
> > >>> s.decode("iso-8859-7")
> > 'Ευχή\\ του\\ Ιησού.mp3'
> >
> > Yes, that looks the same.
>
> Most probably, his terminal is set to ISO-8859-7, so that when he issues
> the rename command on the command-line of his shell session, the "mv"
> command gets a stream of bytes as the new file name which happens to be
> the ISO-8859-7 encoding of the file name he'd like the file to have.
> This is what's stored on disk.
>
> So, his biggest problem isn't that the operating system is encoding
> agnostic wrt. filenames (i.e., treats them as a stream of bytes), but
> rather that he's using an ISO-7 terminal window when having set up UTF-8
> as his operating system locale and expects filenames to be encoded in
> UTF-8 when he's not passing in UTF-8 byte streams from his client
> computer at all.
>
> --
> --- Heiko.
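
To make the mismatch Heiko describes concrete, here is a minimal sketch. It reuses the bytes from MRAB's example (with the stray shell backslashes left out) and assumes nothing beyond the two encodings already named in this thread. It only shows that the same filename has two different byte representations, and that the ISO-8859-7 one is not valid UTF-8.

```python
# Bytes from MRAB's example, without the shell-escaping backslashes.
raw = b"\305\365\367\336 \364\357\365 \311\347\363\357\375.mp3"

# Read as ISO-8859-7, the bytes form a readable Greek filename:
# as_greek is 'Ευχή του Ιησού.mp3'
as_greek = raw.decode("iso-8859-7")

# Read as UTF-8, the very same bytes are not a valid sequence.
try:
    raw.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False

# What a UTF-8 terminal would have sent to "mv" for the same name.
utf8_bytes = as_greek.encode("utf-8")

print(valid_utf8)                  # False
print(len(raw), len(utf8_bytes))   # 18 30
```

So whichever encoding the terminal uses when the file is renamed is the one that ends up in the directory entry, regardless of the server's locale.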

nikos at superhost.gr [~/www/data/apps]# ls -l | file -
/dev/stdin: ASCII text


import os

# Collect the full path of every file currently under the apps directory
fullpaths = set()
path = "/home/nikos/public_html/data/apps/"

for root, dirs, files in os.walk(path):
    for filename in files:
        fullpaths.add(os.path.join(root, filename))
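
One possible variation, offered only as a sketch: when os.walk() is given a bytes path it returns bytes names, so the raw on-disk bytes can be decoded explicitly instead of relying on the locale. The helper names decode_name and walk_decoded are made up for illustration, and trying UTF-8 first with ISO-8859-7 as the fallback is an assumption based on the encodings discussed above, not something known about every file on the server.

```python
import os

def decode_name(raw, fallback="iso-8859-7"):
    """Decode raw filename bytes: try UTF-8 first, then the fallback codec."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # errors="replace" keeps this from raising on the few byte values
        # that ISO-8859-7 leaves undefined.
        return raw.decode(fallback, errors="replace")

def walk_decoded(top):
    """Yield (raw_bytes_path, readable_str_path) for every file under top."""
    # A bytes argument makes os.walk() yield bytes roots and bytes names.
    for root, dirs, files in os.walk(os.fsencode(top)):
        for name in files:
            raw = os.path.join(root, name)
            yield raw, decode_name(raw)
```

The raw bytes path is what open() or os.rename() need to reach the file, while the decoded string is what is safe to display or store in the database.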

----------------------------
[Thu Jun 06 13:34:19 2013] [error] [client 79.103.41.173]     cur.execute('''SELECT url FROM files WHERE url = %s''', fullpath.encode('iso-8859-7') )
[Thu Jun 06 13:34:19 2013] [error] [client 79.103.41.173]   File "/usr/local/lib/python3.3/encodings/iso8859_7.py", line 12, in encode
[Thu Jun 06 13:34:19 2013] [error] [client 79.103.41.173]     return codecs.charmap_encode(input,errors,encoding_table)
[Thu Jun 06 13:34:19 2013] [error] [client 79.103.41.173] UnicodeEncodeError: 'charmap' codec can't encode characters in position 34-37: character maps to <undefined>
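
A likely reading of this traceback, assuming the filesystem encoding is UTF-8 as Heiko suggests: when os.walk() is given a str path, Python decodes filenames with the "surrogateescape" error handler, so every byte that is not valid UTF-8 becomes a lone surrogate character. Those surrogates have no ISO-8859-7 equivalent, so encode() fails. The sketch below reproduces that with a shortened, illustrative filename (the first four bytes of MRAB's example), not the actual file from the log.

```python
# Illustrative path: the directory from above plus four ISO-8859-7 bytes.
raw = b"/home/nikos/public_html/data/apps/\305\365\367\336.mp3"

# Roughly what os.walk() hands back for this name under a UTF-8 locale:
# each undecodable byte is smuggled through as a lone surrogate.
as_str = raw.decode("utf-8", "surrogateescape")

# Encoding that str to ISO-8859-7 fails on the surrogates.
failed_at = None
try:
    as_str.encode("iso-8859-7")
except UnicodeEncodeError as exc:
    failed_at = (exc.start, exc.end)

# Reversing the surrogateescape step recovers the original bytes,
# which can then be decoded with the encoding they were written in.
recovered = as_str.encode("utf-8", "surrogateescape")
greek = recovered.decode("iso-8859-7")

print(failed_at)         # (34, 38), i.e. positions 34-37 as in the log
print(recovered == raw)  # True
```

On a system whose filesystem encoding really is UTF-8, os.fsencode(as_str) performs the same reversal, so os.fsencode(fullpath).decode('iso-8859-7') would be one way to get a clean string for the query.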


[Thu Jun 06 13:27:17 2013] [error] [client 79.103.41.173] Traceback (most recent call last):
[Thu Jun 06 13:27:17 2013] [error] [client 79.103.41.173]   File "files.py", line 73, in <module>
[Thu Jun 06 13:27:17 2013] [error] [client 79.103.41.173]     cur.execute('''SELECT url FROM files WHERE url = %s''', fullpath.decode('iso-8859-7') )
[Thu Jun 06 13:27:17 2013] [error] [client 79.103.41.173] AttributeError: 'str' object has no attribute 'decode'

I get the same error when I encode as Latin.


