[Python-Dev] a suggestion ... Re: PEP 383 (again)

Stephen J. Turnbull stephen at xemacs.org
Wed Apr 29 14:18:42 CEST 2009

Thomas Breuel writes:

 > PEP 383 violated (2), and I think that's a bad thing.

The whole purpose of PEP 383 is to send exactly the same bytes that
were read from the OS back to the OS, which necessarily violates (2)
(for whatever the apparent system file encoding is, not limited to
UTF-8), and that goal has overwhelmingly popular support.

Note that this won't happen automatically, either, AIUI.  The PEP's
proposed implementation is as an error handler, and this would need to
be specified explicitly.  It's not intended to be the default.
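
A minimal sketch of that round trip, assuming the handler name
"surrogateescape" under which this mechanism shipped in Python 3.1 (the
message above does not name it) and a made-up file name. The handler is
passed explicitly; the default "strict" handler rejects the same bytes.

```python
# Hypothetical file name as the OS might report it: Latin-1 bytes,
# which are not valid UTF-8.
raw = b"caf\xe9.txt"

# The default (strict) error handler refuses the undecodable byte.
try:
    raw.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# With the handler named explicitly, the bad byte 0xE9 is mapped to
# the lone surrogate U+DCE9 instead of raising.
name = raw.decode("utf-8", errors="surrogateescape")

# Encoding with the same handler turns the surrogate back into the
# original byte, so the OS gets exactly what it gave us.
restored = name.encode("utf-8", errors="surrogateescape")
```

The resulting str is not well-formed Unicode text, which is the price
paid for getting the original bytes back unchanged.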

 > I think the best solution would be to use (3a) and fall back to (3b) if that
 > doesn't work.  If people try to write those strings, they will always get
 > written as correctly encoded UTF-8 strings.

The intended audience isn't trying to write anything in particular,
though.  They just want to repeat verbatim what the OS told them.
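
To illustrate why "always correctly encoded UTF-8" and "repeat
verbatim" pull in different directions, here is a small sketch. The
string literal stands for a hypothetical file name that was decoded
with the "surrogateescape" handler (the name used in Python 3.1 and
later); nothing here is taken from the PEP text itself.

```python
# Hypothetical decoded file name carrying one escaped byte (0xE9)
# as the lone surrogate U+DCE9.
name = "caf\udce9.txt"

# Strict UTF-8 encoding cannot represent a lone surrogate at all.
try:
    name.encode("utf-8")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

# Forcing well-formed output (here with the "replace" handler) does
# succeed, but the bytes no longer match what the OS reported.
lossy = name.encode("utf-8", errors="replace")

# Only the escaping handler reproduces the OS's bytes verbatim.
verbatim = name.encode("utf-8", errors="surrogateescape")
```

In this sketch the well-formed result names a different file than the
one the OS listed, while the verbatim result is not valid UTF-8.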

 > There is yet another option, which is arguably the "right" one: make the
 > results of os.listdir() subclasses of string that keep track of where they
 > came from.

Sure.  This has been mentioned by several people.  Martin has no
intention of doing it in PEP 383, though, so it will need a new PEP.
It has gotten strong pushback from several people, as well.
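
For concreteness, a rough sketch of what such a subclass could look
like. The class name OSName, its "raw" attribute, and the choice of
the "replace" handler for the displayed text are all hypothetical;
this is not a design anyone in the thread proposed in this form.

```python
class OSName(str):
    """Hypothetical str subclass that remembers the bytes it came from."""

    def __new__(cls, raw, encoding="utf-8"):
        # The visible text is a best-effort decoding for display.
        text = raw.decode(encoding, errors="replace")
        self = super().__new__(cls, text)
        # The original bytes are kept for handing back to the OS.
        self.raw = raw
        return self


name = OSName(b"caf\xe9.txt")

# Ordinary str methods return plain str, so the origin is lost
# as soon as the name is manipulated.
upper = name.upper()
```

Here name.raw still holds the bytes the OS reported, but any derived
string such as upper is a plain str with no record of its origin.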
