
On Friday 07 May 2010 13:24:18, Antoine Pitrou wrote:
UTF-8 is not a good choice for the fallback because it's incompatible with other encodings like Latin1. I would like to fall back to ASCII on error, which is compatible with all encodings (thanks to surrogateescape).
What do you mean by "compatible with all encodings thanks to surrogateescape"?
>>> "àéè".encode("ascii", "surrogateescape")
...
UnicodeEncodeError: 'ascii' codec can't encode characters ...
ascii+surrogateescape can *decode* anything:
>>> b"a\xc3\xff".decode('ascii', 'surrogateescape')
'a\udcc3\udcff'
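To make the behaviour concrete, here is a minimal sketch for Python 3. The helper names `roundtrip` and `can_encode` are only illustrative and do not come from any patch. It shows that arbitrary bytes survive a decode/encode round trip through ascii+surrogateescape whatever encoding produced them, while genuine non-ASCII text is rejected on encode.

```python
def roundtrip(raw: bytes) -> bytes:
    """Decode bytes with ascii+surrogateescape, then encode them back."""
    # Bytes >= 0x80 become lone surrogates U+DC80..U+DCFF on decode
    text = raw.decode("ascii", "surrogateescape")
    # ...and those surrogates are turned back into the original bytes on encode
    return text.encode("ascii", "surrogateescape")


def can_encode(text: str) -> bool:
    """Return True if text can be encoded with ascii+surrogateescape."""
    try:
        text.encode("ascii", "surrogateescape")
    except UnicodeEncodeError:
        return False
    return True


# The same (hypothetical) filename as it would be stored on two file systems
latin1_name = "\xe9.txt".encode("latin-1")  # b'\xe9.txt'
utf8_name = "\xe9.txt".encode("utf-8")      # b'\xc3\xa9.txt'

# Both byte strings are preserved, whatever the real encoding was
assert roundtrip(latin1_name) == latin1_name
assert roundtrip(utf8_name) == utf8_name

# Real non-ASCII characters are refused; escaped surrogates are accepted
assert not can_encode("\xe0\xe9\xe8")
assert can_encode("a\udcc3\udcff")
```

The point of the design is that a filename read from the file system can always be written back unchanged, but Python never invents a new byte sequence for a non-ASCII character when the real encoding is unknown.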
Encoding with ascii+surrogateescape raises a UnicodeEncodeError for non-ASCII characters (except for surrogates). I think it's better to raise an error than to create UTF-8 filenames on a Latin-1 file system.

--
I forgot to mention Marc Lemburg's proposal to create a PYTHONFSENCODING environment variable: #8622.

--
Victor Stinner
http://www.haypocalc.com/