[Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue

"Martin v. Löwis" martin at v.loewis.de
Tue Sep 30 02:07:23 CEST 2008


>   import os
>   import os.path
>   import sys
>   if os.path.supports_unicode_filenames:
>      cwd = os.getcwd()
>   else:
>      cwd = os.getcwdb()
>      encoding = sys.getfilesystemencoding()
>   for filename in os.listdir(cwd):
>      if not os.path.supports_unicode_filenames:
>         text = str(filename, encoding, "replace")
>      else:
>         text = filename
>      print("=== File {0} ===".format(text))
>      for line in open(filename):
>         ...
> 
> We need an "if" to choose the directory. The second "if" is only needed to
> display the filename. Using bytes, it would be possible to write better code:
> detect the real charset (e.g. ISO-8859-1 in a UTF-8 file system) and so
> display the filename correctly and/or propose to rename the file. Would that
> be possible using the UTF-8b / PUA hacks?

Not sure what "it" is. If "it" is "write the code above", then using the
PUA hack:

for filename in os.listdir(os.getcwd()):
    text = repr(filename)
    print("=== File {0} ===".format(text))
    for line in open(filename):
        ...

If "it" is "display the filename": sure, see above. If "it" is "detect
the real charset": sure, why not?
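
To make "detect the real charset" a bit more concrete, here is a minimal
sketch that works on the raw bytes of a name. It is only an illustration:
the function name guess_filename_charset and the candidate list are
assumptions of this example, not something proposed in this thread, and it
does not depend on the PUA hack at all. It simply tries each candidate
charset with strict decoding and keeps the first one that succeeds.

```python
def guess_filename_charset(raw_name, candidates=("utf-8", "iso-8859-1")):
    """Return (charset, text) for the first candidate that decodes raw_name.

    Hypothetical helper for illustration only. Candidates are tried in
    order with strict error handling, so an earlier candidate wins when
    several would succeed.
    """
    for charset in candidates:
        try:
            return charset, raw_name.decode(charset)
        except UnicodeDecodeError:
            continue
    # No candidate matched: return a lossy but printable fallback.
    return None, raw_name.decode("ascii", "replace")
```

Note that ISO-8859-1 accepts every byte sequence, so it only makes sense as
the last candidate. The raw names could come from
os.listdir(os.getcwdb()), which returns bytes when given a bytes path.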

Regards,
Martin

