[Tutor] sys.getfilesystemencoding()
Albert-Jan Roskam
fomcl at yahoo.com
Tue Dec 18 14:13:58 CET 2012
Hi,
I am trying to write a file with a 'foreign' Unicode name (I am aware that this is a highly western-o-centric way of putting it). On Linux, I can encode the name to UTF-8 and it is displayed correctly. On Windows XP, sys.getfilesystemencoding() returns 'mbcs', and the characters apparently cannot be represented in that encoding. How can I write file names that are encoded correctly on any platform? Or is this a shortcoming of Windows?
# Python 2.6.4 (r264:75708, Oct 26 2009, 08:23:19) [MSC v.1500 32 bit (Intel)] on win32
import sys

def _encodeFileName(fn):
    """Helper function to encode unicode file names into system file names.
    http://effbot.org/pyref/sys.getfilesystemencoding.htm"""
    isWindows = sys.platform.startswith("win")
    isUnicode = isinstance(fn, unicode)
    if isUnicode:  # and not isWindows
        encoding = sys.getfilesystemencoding()  # 'mbcs' on Windows, 'utf-8' on Linux
        encoding = "utf-8" if not encoding else encoding
        return fn.encode(encoding)
    return fn

fn = u'\u0c0f\u0c2e\u0c02\u0c21\u0c40' + '.txt'  # Telugu language
with open(_encodeFileName(fn), "wb") as w:
    w.write("yaay!\n")
# the characters of the FILE NAME can not be represented in the encoding (squares/tofu)
print "written: ", w.name
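To make the symptom concrete, here is a minimal sketch, written for Python 3 rather than the 2.6.4 session above. It checks whether the name can be encoded in a given code page, then writes the file by passing the text name straight to open() instead of encoding it by hand. Note that 'cp1252' is only an assumed stand-in for what 'mbcs' resolves to on a Western European Windows install (the 'mbcs' codec itself exists only on Windows), and representable() and write_with_text_name() are hypothetical helper names, not part of any library.

```python
import os
import tempfile

# Same Telugu file name as in the post.
NAME = u'\u0c0f\u0c2e\u0c02\u0c21\u0c40' + u'.txt'


def representable(name, encoding):
    """Return True if every character of `name` can be encoded in `encoding`."""
    try:
        name.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


def write_with_text_name(directory, name):
    """Write a small file, handing the text (unicode) name directly to open().

    No manual .encode() call: Python converts the name for the platform.
    """
    path = os.path.join(directory, name)
    with open(path, "wb") as w:
        w.write(b"yaay!\n")
    return path


if __name__ == "__main__":
    # ASSUMPTION: cp1252 approximates 'mbcs' on a Western European Windows.
    print(representable(NAME, "cp1252"))  # False
    print(representable(NAME, "utf-8"))   # True
    target_dir = tempfile.mkdtemp()
    path = write_with_text_name(target_dir, NAME)
    print(os.path.isfile(path))
```

If the check fails for the ANSI code page, encoding the name by hand cannot work there. As far as I know, Windows builds of Python have accepted unicode file names directly since PEP 277, so passing the unencoded name to open() may be the more portable route. Squares in the listing can also simply mean that the font lacks Telugu glyphs, which is a separate issue from the encoding.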
Thank you very much in advance!
Regards,
Albert-Jan
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
All right, but apart from the sanitation, the medicine, education, wine, public order, irrigation, roads, a
fresh water system, and public health, what have the Romans ever done for us?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~