[Python-Dev] Bytes path support

Nick Coghlan ncoghlan at gmail.com
Thu Aug 21 01:26:51 CEST 2014

On 21 Aug 2014 09:06, "Chris Barker" <chris.barker at noaa.gov> wrote:

> As I understand it, the whole problem with some posix systems is that
> there is NO filesystem encoding -- i.e. you can't know for sure what
> encoding a filename is in. So you need to be able to pass the bytes through
> as they are.
> (At least as I read Armin Ronacher's blog)

Armin lets his astonishment at the idea we'd expect Linux vendors to fix
their broken OS get the better of him at times - he thinks the
responsibility lies entirely with us to work around its quirks and
limitations :)

The "surrogateescape" error handler is our main answer to the unreliability
of the POSIX encoding model - fsdecode will squirrel away undecodable bytes
as lone surrogates (U+DC80 to U+DCFF), and then fsencode will restore them
again later. That works for the simple round tripping case, but we currently
lack good default tools for "cleaning" strings that may contain surrogates
(or even scanning a string to see if surrogates are present).
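
To make the round trip concrete, here is a minimal sketch. It calls
bytes.decode/str.encode with UTF-8 directly rather than fsdecode/fsencode, so
the result does not depend on the local filesystem encoding. The
has_surrogates helper is a hypothetical name for illustration, not an
existing stdlib function.

```python
# A filename in Latin-1: the byte 0xE9 is not valid UTF-8 in this position.
raw = b"caf\xe9.txt"

# Decoding with surrogateescape maps the undecodable byte 0xE9 to the
# lone surrogate U+DCE9 instead of raising UnicodeDecodeError.
name = raw.decode("utf-8", errors="surrogateescape")

# Encoding with the same handler turns the surrogate back into 0xE9.
restored = name.encode("utf-8", errors="surrogateescape")


def has_surrogates(text):
    """Hypothetical helper: True if text contains any surrogate code point."""
    return any(0xD800 <= ord(ch) <= 0xDFFF for ch in text)


print(restored == raw)        # the round trip is lossless
print(has_surrogates(name))   # the escaped byte is detectable by scanning
```

A scan like this is the kind of check that currently has to be written by
hand before passing such a string to code that expects clean text.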

One idea I had along those lines is a surrogatereplace error handler (
http://bugs.python.org/issue22016) that emitted an ASCII question mark for
each smuggled byte, rather than propagating the encoding problem.
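
As a rough illustration of that idea (not the patch on the issue), an error
handler along those lines can be approximated today with
codecs.register_error. The handler name "surrogatereplace_sketch" and the
function below are assumptions made up for this example.

```python
import codecs


def _surrogatereplace_sketch(exc):
    """Illustrative handler: emit '?' for each byte smuggled in by
    surrogateescape, and re-raise for any other encoding problem."""
    if not isinstance(exc, UnicodeEncodeError):
        raise exc
    bad = exc.object[exc.start:exc.end]
    # surrogateescape only ever produces code points U+DC80..U+DCFF.
    if all(0xDC80 <= ord(ch) <= 0xDCFF for ch in bad):
        return ("?" * len(bad), exc.end)
    raise exc


# Hypothetical handler name, chosen to avoid implying a stdlib feature.
codecs.register_error("surrogatereplace_sketch", _surrogatereplace_sketch)

name = b"caf\xe9.txt".decode("utf-8", errors="surrogateescape")
cleaned = name.encode("utf-8", errors="surrogatereplace_sketch").decode("utf-8")
print(cleaned)
```

The cleaned string can no longer be turned back into the original bytes, so
this only makes sense for display or logging, not for reopening the file.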


> -Chris
> --
> Christopher Barker, Ph.D.
> Oceanographer
> Emergency Response Division
> NOAA/NOS/OR&R            (206) 526-6959   voice
> 7600 Sand Point Way NE   (206) 526-6329   fax
> Seattle, WA  98115       (206) 526-6317   main reception
> Chris.Barker at noaa.gov