[Python-3000] [Python-Dev] Filename as byte string in python 2.6 or 3.0?

Stephen J. Turnbull stephen at xemacs.org
Fri Oct 10 08:55:56 CEST 2008


Glenn Linderman writes:

 > Define a conforming process.

For present purposes, one that promises not to emit invalid Unicode
strings as Unicode.
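
As a rough illustration of that promise, here is a minimal sketch in
Python 3.  The helper names `is_valid_unicode` and `emit_filename` are
hypothetical, and "invalid" is narrowed here to "contains lone
surrogates", which is only one of the ways a string can fail
validation.

```python
def is_valid_unicode(s: str) -> bool:
    """Return True if s can be encoded as strict UTF-8.

    Assumption: "invalid" is taken to mean "contains lone surrogates",
    which the strict UTF-8 encoder rejects.
    """
    try:
        s.encode("utf-8")
    except UnicodeEncodeError:
        return False
    return True


def emit_filename(s: str) -> str:
    """Hypothetical output gate: refuse to hand invalid text to other processes."""
    if not is_valid_unicode(s):
        raise ValueError(f"refusing to emit invalid Unicode: {s!a}")
    return s
```

A process that routes every outgoing name through a gate like this
keeps the promise, at the cost of failing on names it cannot validate.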

 > If it is one that handles Unicode with full validation, all is 
 > wonderful, except on platforms that permit non-validated Unicode names 
 > or non-Unicode names.  And these are precisely the platforms for which 
 > these various translation schemes have been proposed.

Those aren't the proposals I've been reading about.  True, people have
suggested restricting the translation schemes, with coverage varying
by platform.  But AFAIK, all platforms supported by Python allow NFS
mounts, not to mention FAT filesystems on removable devices, so in
practice any of them may encounter arbitrary filenames in arbitrary
encodings.  Nor is it trivial for Python to figure out what
filesystems, let alone encodings, are being used.  So Python has to
support whatever is decided, period, perhaps with more or less complex
heuristics to tune treatment to platforms.
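
To make the encoding ambiguity concrete, here is a small sketch in
Python 3.  The function name `candidate_decodings` and the sample
bytes are made up for illustration.  It shows that the same filename
bytes can decode cleanly, and differently, under several encodings,
so a successful decode says little about which one was intended.

```python
def candidate_decodings(raw: bytes, encodings):
    """Map each encoding name to the decoded text, or None if decoding fails."""
    results = {}
    for enc in encodings:
        try:
            results[enc] = raw.decode(enc)
        except UnicodeDecodeError:
            results[enc] = None
    return results


# Hypothetical name as it might arrive from an NFS mount or a FAT volume.
raw_name = b"caf\xe9.txt"
decoded = candidate_decodings(raw_name, ["utf-8", "latin-1", "koi8_r"])
for enc, text in decoded.items():
    print(enc, ascii(text))
```

Only the strict UTF-8 decode fails here.  The two single-byte
encodings both succeed, with different results, and nothing in the
bytes themselves says which is right.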

 > And so they will not enforce full validation on file names, even if they 
 > handle full validation on other strings.

Well, in practice that means conforming processes *will* validate at
least some file names, since I don't know of any systems that really
treat file names as anything but strings.

 > And Python will not always be the culprit.

But if the defaults get screwed up here, it will remain one of the
"usual suspects" for a long time to come.  It would be nice to provide
a foundation for doing better than that, but nothing proposed so far
does.  That's not surprising, because the proposals are designed to
preserve, rather than handle, apparently invalid data, in the hope
that somebody else will clean up the mess.
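
For a concrete picture of a "preserve, don't handle" scheme, here is a
sketch using the `surrogateescape` error handler that later Python 3
releases provide.  It postdates this message and is not necessarily
any of the proposals under discussion; the helper names are
hypothetical.

```python
def preserve(raw: bytes) -> str:
    """Decode as UTF-8, smuggling undecodable bytes through as lone surrogates."""
    return raw.decode("utf-8", errors="surrogateescape")


def restore(name: str) -> bytes:
    """Invert preserve(): turn smuggled surrogates back into the original bytes."""
    return name.encode("utf-8", errors="surrogateescape")


def is_clean(name: str) -> bool:
    """True if the string is valid Unicode that a strict consumer would accept."""
    try:
        name.encode("utf-8")
    except UnicodeEncodeError:
        return False
    return True


raw_name = b"caf\xe9.txt"       # Latin-1 bytes, not valid UTF-8
name = preserve(raw_name)       # 'caf\udce9.txt'
assert restore(name) == raw_name
assert not is_clean(name)       # the mess is preserved, not cleaned up
```

The round trip is lossless, but the intermediate string is not valid
Unicode, so any strict consumer it reaches still has to deal with it.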

The problem that all the proposals face is that they assume that we
know where the cleaning up will be done, and that we're in control of
the code that will have to do it.

