[Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

"Martin v. Löwis" martin at v.loewis.de
Sat Apr 25 14:42:37 CEST 2009

> Following on from that, would this (under Martin's proposal) result in
> programs receiving encoded strings, or just semantically-incorrect
> ones?

Not sure I understand the question - what is an "encoded string"?

As you analyse below, the current (2.x) file system encoding will
sometimes do the right thing; sometimes it will decode successfully
but still not give the intended string; and sometimes it will fail.
With the PEP, it won't fail, but it will give back a string that
likely wasn't intended by the user. This might be confusing if you try
to render it in a user interface; if the application merely passes it
back to file system APIs, it will work fine.
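
To make the three cases concrete, here is a minimal sketch. It assumes
the error handler is available under the name "surrogateescape" (the
name PEP 383 ended up with in Python 3.1), and the file name is a
made-up example, not one taken from this thread.

```python
# Hypothetical file name: "café.txt" stored as Latin-1 bytes, which is
# not valid UTF-8.
raw = b"caf\xe9.txt"

# Case 1: strict decoding fails outright.
try:
    raw.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# Case 2: decoding "succeeds" with the wrong codec, but the result is not
# the intended string (UTF-8 bytes misread as Latin-1).
mojibake = b"caf\xc3\xa9.txt".decode("latin-1")

# With the PEP's handler, the undecodable byte 0xE9 becomes the lone
# surrogate U+DCE9 instead of raising an error.
name = raw.decode("utf-8", "surrogateescape")

# Passing the string back to the file system side restores the exact
# original bytes.
roundtrip = name.encode("utf-8", "surrogateescape")

# Rendering is the confusing part: the lone surrogate cannot be encoded
# strictly, e.g. when writing to a UTF-8 terminal or GUI.
try:
    name.encode("utf-8")
    renderable = True
except UnicodeEncodeError:
    renderable = False
```

The round trip is what makes the "merely passes it back" case safe;
the failed strict encode is the user-interface problem mentioned above.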

> So, the next question is - do people on such systems frequently use
> high-bit characters in filenames?

They typically do until they run into problems. For example, if they
set the locale to something and then create files in their home
directory, it will work just fine, and nobody else will ever see the
files (except for the backup software).

When they find that the files they created are inaccessible to others,
they will often stop using funny characters.

