On 2008-09-30 16:05, Guido van Rossum wrote:
> On Tue, Sep 30, 2008 at 3:31 AM, M.-A. Lemburg <mal@egenix.com> wrote:
>> On 2008-09-30 08:00, Martin v. Löwis wrote:
>>>> Change the default file system encoding to store bytes in Unicode is like introducing a new Python type: <fake Unicode for filename hacks>.
>>>
>>> Exactly. Seems like the best solution to me, despite your polemics.
>>
>> Not a bad idea... have os.listdir() return Unicode subclasses that work like file handles, ie. they have an extra buffer that holds the original bytes value received from the underlying C API.
>>
>> Passing these handles to open() would then do the right thing by using whatever os.listdir() got back from the file system to open the file, while still providing a sane way to display the filename, e.g. using question marks for the invalid characters.
>>
>> The only problem with this approach is concatenation of such handles to form pathnames, but then perhaps those concatenations could just work on the bytes value as well (I don't know of any OS that uses non-ASCII path separators).
>
> While this seems to work superficially I expect an infinite number of problems caused by code that doesn't understand this subclass. You are hinting at this in your last paragraph.
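To make the concern concrete, here is a minimal sketch of the kind of subclass being discussed, written against current Python 3 where str is the Unicode type. The class name FileName and its raw attribute are hypothetical and come from nobody's proposal in this thread. Note also that errors="replace" yields U+FFFD rather than a literal question mark.

```python
class FileName(str):
    """Hypothetical str subclass that remembers the raw bytes of a filename."""

    def __new__(cls, raw, encoding="utf-8"):
        # Undecodable bytes are displayed as U+FFFD replacement characters.
        self = super().__new__(cls, raw.decode(encoding, errors="replace"))
        self.raw = raw  # original bytes as received from the file system
        return self


name = FileName(b"caf\xe9.txt")  # Latin-1 bytes, not valid UTF-8
joined = "/tmp/" + name          # ordinary str concatenation

print(repr(str(name)))           # displayable text
print(name.raw)                  # original bytes are still available
print(type(joined).__name__)     # the result is a plain str
print(hasattr(joined, "raw"))    # the raw bytes did not survive
```

Any code that builds a path with ordinary string operations gets back a plain str, so the original bytes are silently lost unless the subclass overrides every such operation.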
Well, to some extent Unicode objects themselves already implement such a strategy: the default encoded bytes object basically provides the low-level interfacing value. But I agree, the approach is not foolproof.

In the end, I think it's better not to be clever and just return the filenames that cannot be decoded as bytes objects in os.listdir(). Passing those to open() will then open the files as expected; in most other cases the application will have to provide explicit conversions in whatever way best fits the application.

Also note that os.listdir() isn't the only source of filenames. You often read them from a file, a database, some socket, etc., so letting the application decide what to do is not asking too much, IMHO.

-- 
Marc-Andre Lemburg
eGenix.com
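A rough sketch of what "let the application decide" could look like. It does not implement the mixed str/bytes return value suggested above. As a stand-in, it relies on the fact that in current Python 3 os.listdir() returns bytes names when given a bytes path. The helper display_name and its default encoding are assumptions made for illustration only.

```python
import os
import tempfile


def display_name(name, encoding="utf-8"):
    """Hypothetical helper: printable str for a str or bytes filename."""
    if isinstance(name, bytes):
        # Explicit, application-chosen conversion; bad bytes become U+FFFD.
        return name.decode(encoding, errors="replace")
    return name


with tempfile.TemporaryDirectory() as tmp:
    raw_dir = os.fsencode(tmp)
    with open(os.path.join(raw_dir, b"hello.txt"), "wb") as f:
        f.write(b"data")

    entries = os.listdir(raw_dir)  # bytes path in, bytes names out
    # The bytes name goes straight back to open(), no decoding needed.
    with open(os.path.join(raw_dir, entries[0]), "rb") as f:
        content = f.read()

# Conversion for display happens only where the application asks for it.
shown = display_name(b"caf\xe9.txt")
print(repr(shown))
```

The file system round trip never decodes the name; only the display step picks an encoding and an error policy, and a different application could pick differently.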