[Python-3000] New proposition for Python3 bytes filename issue
g.brandl at gmx.net
Mon Sep 29 18:45:28 CEST 2008
Victor Stinner wrote:
> POSIX OS
> The default behaviour should be to use unicode and raise an error if
> conversion to unicode fails. It should also be possible to use bytes using
> bytes arguments and optional arguments (for getcwd).
> - listdir(unicode) -> unicode and raise an error on invalid filename
> - listdir(bytes) -> bytes
> - getcwd() -> unicode
> - getcwd(bytes=True) -> bytes
> - open(): accept bytes or unicode
> os.path.*() should accept operations on bytes filenames, but maybe not on
> bytes+unicode arguments. os.path.join('directory', b'filename'): raise an
> error (or use *implicit* conversion to bytes)?
This approach (changing all path-handling functions to accept either bytes
or string, but not both) is doomed in my eyes. First, there are lots of them,
second, they are not only in os.path but in many modules and also in user
code, and third, I see no clean way of implementing them in the specified way.
(Just try to do it with os.path.join as an example; I couldn't find the
good way to write it, only the bad and the ugly...)
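For illustration only, here is a minimal sketch of what the "either bytes or
str, but not both" rule might look like for a POSIX-style join. The name
strict_join is hypothetical; it is not os.path.join and not code from this
thread. It assumes "/" as the only separator.

```python
def strict_join(first, *rest):
    """Join path components that are all str or all bytes (POSIX-style).

    Hypothetical helper for illustration; not the real os.path.join.
    """
    kind = type(first)
    if kind not in (str, bytes):
        raise TypeError("path components must be str or bytes")
    # Every literal has to be picked to match the argument type.
    sep = b"/" if kind is bytes else "/"
    path = first
    for part in rest:
        if type(part) is not kind:
            raise TypeError("cannot mix str and bytes path components")
        if part.startswith(sep):
            # An absolute component discards everything before it.
            path = part
        elif not path or path.endswith(sep):
            path += part
        else:
            path += sep + part
    return path
```

Even in this small case the separator has to be selected per call, and the
same type check would have to be repeated in every other path-handling
function, in the standard library and in user code alike.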
If I had to choose, I'd still argue for the modified UTF-8 as the filesystem
encoding (wherever it would otherwise be plain UTF-8), despite possible
surprises when a filename encoded that way escapes from Python.
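To make the trade-off concrete, here is a small sketch of the round trip and
of the "escape" problem. It rests on an assumption: the message does not spell
out what "modified UTF-8" means, so the "surrogateescape" error handler found
in later Python 3 releases is used as a stand-in and may differ from the
scheme meant here. The example filename is made up.

```python
# A Latin-1 encoded name; the byte 0xE9 is not valid UTF-8 here.
raw = b"caf\xe9.txt"

# Stand-in for a "modified UTF-8": undecodable bytes become lone surrogates.
name = raw.decode("utf-8", "surrogateescape")

# Inside Python the original bytes can be recovered losslessly.
round_tripped = name.encode("utf-8", "surrogateescape")

# Outside Python a strict UTF-8 encoder rejects the lone surrogate.
try:
    name.encode("utf-8")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False
```

Here round_tripped equals raw, while strict_ok ends up False: the string is
fine as long as it stays inside Python, and fails once something tries to
treat it as ordinary UTF-8 text.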
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
Four shall be the number of spaces thou shalt indent, and the number of thy
indenting shall be four. Eight shalt thou not indent, nor either indent thou
two, excepting that thou then proceed to four. Tabs are right out.