[Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue

Guido van Rossum guido at python.org
Tue Sep 30 16:05:58 CEST 2008


On Tue, Sep 30, 2008 at 3:31 AM, M.-A. Lemburg <mal at egenix.com> wrote:
> On 2008-09-30 08:00, Martin v. Löwis wrote:
>>> Changing the default file system encoding to store bytes in Unicode is like
>>> introducing a new Python type: <fake Unicode for filename hacks>.
>>
>> Exactly. Seems like the best solution to me, despite your polemics.
>
> Not a bad idea... have os.listdir() return Unicode subclasses that work
> like file handles, i.e. they have an extra buffer that holds the original
> bytes value received from the underlying C API.
>
> Passing these handles to open() would then do the right thing by using
> whatever os.listdir() got back from the file system to open the file,
> while still providing a sane way to display the filename, e.g. using
> question marks for the invalid characters.
>
> The only problem with this approach is concatenation of such handles
> to form pathnames, but then perhaps those concatenations could just
> work on the bytes value as well (I don't know of any OS that uses non-
> ASCII path separators).

While this seems to work superficially, I expect an infinite number of
problems caused by code that doesn't understand this subclass. You are
hinting at this in your last paragraph.
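
A minimal sketch of the kind of breakage meant here. The class name
FilenameStr and its raw attribute are hypothetical, invented for
illustration; nothing like them was specified in this thread. It uses
Python 3 str semantics and assumes a UTF-8 file system encoding:

```python
class FilenameStr(str):
    """Hypothetical str subclass that also carries the original bytes."""

    def __new__(cls, raw, encoding="utf-8"):
        # Undecodable bytes are replaced with U+FFFD for display only.
        self = super().__new__(cls, raw.decode(encoding, errors="replace"))
        # Keep the bytes as received from the OS (assumed attribute name).
        self.raw = raw
        return self


# Latin-1 encoded name, which is not valid UTF-8.
name = FilenameStr(b"caf\xe9.txt")

# Ordinary string operations know nothing about the subclass, so they
# return plain str objects and the original bytes are silently dropped.
joined = "/tmp/" + name
upper = name.upper()

print(type(name).__name__, type(joined).__name__, type(upper).__name__)
print(hasattr(name, "raw"), hasattr(joined, "raw"))
```

Any library that slices, joins, formats or normalizes the name would
have to be taught to preserve the extra bytes, or the path can no
longer be opened by its real name.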

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

