Python 3 encoding question: Read a filename from stdin, subsequently open that filename
Peter Otten
__peter__ at web.de
Wed Dec 1 04:34:24 EST 2010
Nobody wrote:
> Python 3.x's decision to treat filenames (and environment variables) as
> text even on Unix is, in short, a bug. One which, IMNSHO, will mean that
> Python 2.x is still around when Python 4 is released.
For filenames in Python 3 the user has the choice between "text" (str) and
bytes. If the user chooses text, it is converted to bytes using a
default encoding that hopefully matches that of the other tools on the
machine that manipulate filenames.
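
As a rough illustration of that choice (a minimal sketch, assuming Python 3.2 or later, where os.fsencode() exists; the helper name open_both_ways and the file name example.txt are made up for this example), the same file can be opened through a str name or through the bytes name derived from it:

```python
import os
import tempfile

def open_both_ways(directory):
    """Create a file via a str name, then reopen it via the bytes name.

    open_both_ways and "example.txt" are hypothetical names used only here.
    """
    text_name = os.path.join(directory, "example.txt")
    with open(text_name, "w") as f:
        f.write("hello")
    # os.fsencode() applies the filesystem encoding to a str filename.
    bytes_name = os.fsencode(text_name)
    with open(bytes_name) as f:
        return bytes_name, f.read()

with tempfile.TemporaryDirectory() as tmp:
    name, content = open_both_ways(tmp)
    print(type(name).__name__, content)
```

Whether the encoded bytes match what other tools on the machine wrote depends on the locale settings, which is the "hopefully" above.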
I see that you may run into problems with the text approach when you
encounter byte sequences that are illegal in the chosen encoding.
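
To make the problem concrete, here is a small sketch. It assumes UTF-8 as the chosen encoding and uses the "surrogateescape" error handler from PEP 383 (Python 3.1+), which is not mentioned above but is one way a text filename can still carry undecodable bytes. The sample name and both helper functions are invented for the example:

```python
# A byte string that is not valid UTF-8 (0xff can never appear in UTF-8).
raw = b"report-\xff.txt"

def try_strict_decode(data, encoding="utf-8"):
    """Return the decoded text, or None if the bytes are illegal."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None

def roundtrip(data, encoding="utf-8"):
    """Decode with surrogateescape and encode back to the original bytes."""
    text = data.decode(encoding, "surrogateescape")
    return text, text.encode(encoding, "surrogateescape")

print(try_strict_decode(raw))   # strict decoding fails
text, back = roundtrip(raw)
print(ascii(text), back == raw)  # the bad byte survives as a lone surrogate
```

The resulting str contains a lone surrogate, so it round-trips to the original bytes but cannot be printed or written out as ordinary UTF-8 text.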
I therefore expect that low-level tools will use bytes to manipulate
filenames while end-user scripts will choose text.
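
For the case in the subject line, a bytes-only version might look roughly like this. It is a sketch under assumptions: one filename per line, terminated by a newline, read from a binary stream such as sys.stdin.buffer. The function names are hypothetical, and an io.BytesIO object stands in for stdin so the example runs on its own:

```python
import io
import os
import tempfile

def read_filename(stream):
    """Read one line from a binary stream and return it as a bytes filename."""
    line = stream.readline()
    # Strip only the single trailing newline; other bytes may belong to the name.
    if line.endswith(b"\n"):
        line = line[:-1]
    return line

def read_file_named_on(stream):
    """Open the file whose name arrives on the stream, without decoding the name."""
    name = read_filename(stream)
    with open(name, "rb") as f:
        return f.read()

# Demo: in a real script the stream would be sys.stdin.buffer.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data.bin")
    with open(path, "wb") as f:
        f.write(b"payload")
    fake_stdin = io.BytesIO(os.fsencode(path) + b"\n")
    print(read_file_named_on(fake_stdin))
```

Because the name is never decoded, it does not matter whether it is valid in the locale's encoding.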
I don't see how a dogmatic bytes-only restriction can improve the situation.
Also, you can already provide Unicode filenames in Python 2.x (and a script
containing constant filenames becomes more portable if you do), so IMHO the
situation in Python 2 and 3 is similar enough not to hinder adoption of
3.x.
Peter