[Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

Aahz aahz at pythoncraft.com
Fri Apr 24 17:27:46 CEST 2009

On Fri, Apr 24, 2009, Paul Moore wrote:
> 2009/4/24 Simon Cross <hodgestar+pythondev at gmail.com>:
>> Humour aside :), the expectation that filenames are Unicode data
>> simply doesn't agree with the reality of POSIX file systems.
> However, it *does* agree with the reality of Windows file systems. The
> fundamental problem here is that there is a strong OS disparity: on
> Windows, the OS uses Unicode; on POSIX, the OS uses bytes.
> Traditionally, Python has been happy to expose OS differences and let
> application code address platform portability issues. But this is such
> a fundamental area that doing so is problematic: it could easily
> result in *more* code being OS-specific (in subtle,
> only-affects-non-Latin-alphabet-using-users ways) rather than less.

The part that I haven't seen clearly addressed so far is what happens
when disks get mounted across OSes (e.g. NFS).
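
To make the cross-mount case concrete, here is a minimal sketch, assuming
Python 3.1 or later, where the error handler proposed in PEP 383 is spelled
"surrogateescape". The file name and the two encodings are made up for
illustration: a name written as Latin-1 bytes on one host, then read on a
host whose locale encoding is UTF-8.

```python
# Hypothetical scenario: a file name created on a Latin-1 host, seen over a
# shared mount by a host whose locale encoding is UTF-8.
raw_name = b"caf\xe9.txt"  # 0xE9 is "é" in Latin-1, but invalid UTF-8 here

# Strict decoding fails, so a str-only API would have to hide or mangle it.
try:
    raw_name.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# The PEP 383 handler maps each undecodable byte to a lone surrogate
# in the range U+DC80..U+DCFF instead of raising.
text_name = raw_name.decode("utf-8", "surrogateescape")

# Encoding with the same handler restores the original bytes exactly.
round_tripped = text_name.encode("utf-8", "surrogateescape")
```

The original bytes stay recoverable, but the str form carries a lone
surrogate, so it cannot be encoded strictly (for printing, say) without
further handling.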

While I agree that there should be a layer on top that can handle "most"
situations, it also seems clear that the raw layer needs to be readily
Aahz (aahz at pythoncraft.com)           <*>         http://www.pythoncraft.com/

"If you think it's expensive to hire a professional to do the job, wait
until you hire an amateur."  --Red Adair
