[Python-Dev] Filename as byte string in python 2.6 or 3.0?
glyph at divmod.com
Tue Sep 30 20:12:31 CEST 2008
On 02:39 pm, hrvoje.niksic at avl.com wrote:
>For example, implementing os.listdir to return the file names as
>Unicode
>subclasses with ability to access the underlying bytes (automatically
>recognized by open and friends) sounds like a good compromise that
>allows the word processor to both have the cake and eat it.
It really seems like the strategy of the current patch (which I believe
Guido proposed) makes the most sense. Programs pass different arguments
for different things:
listdir(text) -> I am thinking in Unicode and I do not know about
encodings; please give me only names that are proper Unicode, because I
don't want to deal with anything else.
listdir(bytes) -> I am thinking about bytes, I know about encodings.
Just give me filenames as bytes and I will decode them myself or do
other fancy things.
You can argue about whether this should really be 'listdiru' or 'globu'
for explicitness, but when such a simple strategy with unambiguous types
works, there's no reason to introduce some weird hybrid bytes/text type
that will inevitably be a bug attractor.
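To make the two call styles concrete, here is a minimal sketch using os.listdir as it behaves in released Python 3, where the argument type selects the result type. The helper name list_both_ways is hypothetical, and the sketch does not try to reproduce how the patch under discussion treats names that cannot be decoded.

```python
import os
import tempfile


def list_both_ways(directory):
    """Return (text_names, byte_names) for the same directory.

    Hypothetical helper for illustration only.
    """
    # str argument: names come back as str.
    text_names = sorted(os.listdir(directory))
    # bytes argument: names come back as raw bytes, undecoded.
    byte_names = sorted(os.listdir(os.fsencode(directory)))
    return text_names, byte_names


with tempfile.TemporaryDirectory() as tmp:
    # Create one file with a plain ASCII name so both listings agree.
    open(os.path.join(tmp, "example.txt"), "w").close()
    text_names, byte_names = list_both_ways(tmp)
```

The caller never has to inspect a hybrid object: the type it passed in tells it which kind of name it will get back.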
Python's path abstractions have never been particularly high level, nor
do I think they necessarily should be - at least, not until there's some
community consensus about what a "high level path abstraction" really
looks like. We're still wrestling with it in Twisted, and I can think
of at least three ways that ours is wrong. And ours is the one that's
doing the best, as far as I can tell :).
This proposal gives higher level software the information that it needs
to construct appropriate paths.
The one thing it doesn't do is expose the decoding rules for the
higher-level applications to deal with. I am pretty sure I don't understand
how the interaction between filesystem encoding and user locale works in
that case, though, so I can't immediately recommend a way to do it.
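Until such rules are exposed, a caller that asked for bytes can at least pick an encoding explicitly and keep track of what fails. The following is only a sketch under that assumption: split_decodable is a hypothetical helper, and the choice of "utf-8" is an example, not a statement about what any filesystem or locale actually uses.

```python
def split_decodable(byte_names, encoding):
    """Split byte filenames into decoded text names and undecodable leftovers.

    Hypothetical helper; the caller supplies the encoding it believes in.
    """
    decoded, undecodable = [], []
    for raw in byte_names:
        try:
            decoded.append(raw.decode(encoding))
        except UnicodeDecodeError:
            # Keep the original bytes so the file can still be opened.
            undecodable.append(raw)
    return decoded, undecodable


# Example input: two valid UTF-8 names and one Latin-1 style name.
names = [b"report.txt", b"caf\xc3\xa9.txt", b"caf\xe9.txt"]
decoded, undecodable = split_decodable(names, "utf-8")
```

The undecodable names stay available as bytes, so nothing is lost; the application decides whether to show them, escape them, or skip them.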