M.-A. Lemburg <mal@egenix.com> wrote:
On 2008-10-01 09:54, Ulrich Eckhardt wrote:
On Tuesday 30 September 2008, M.-A. Lemburg wrote:
On 2008-09-30 08:00, Martin v. Löwis wrote:
Change the default file system encoding to store bytes in Unicode is like introducing a new Python type: <fake Unicode for filename hacks>.
Exactly. Seems like the best solution to me, despite your polemics.
Not a bad idea... have os.listdir() return Unicode subclasses that work like file handles, i.e. they have an extra buffer that holds the original bytes value received from the underlying C API.
Why does it have to be a Unicode subclass? In my eyes, a Unicode object promises a few things, in particular that it contains a Unicode string. If it now suddenly contains bytes without any further meaning, that would be bad.
Please read my entire email. I was proposing to store the underlying non-decodable byte string value in such a subclass. The Unicode value of the object would then be that underlying value decoded as e.g. Latin-1 in order to be able to work on it as text.
I'm actually sort of liking this idea. A Pathname class, for convenience a subtype of String, but containing the underlying binary representation used by the OS. Even non-Unicode pathnames could be represented.

Bill
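
To make the idea concrete, here is a rough sketch of what such a class might look like in Python 3. Nothing here is an existing API: the class name Pathname, the attributes raw and decodable, and the choice to try the file system encoding first and fall back to Latin-1 are all assumptions made for illustration, loosely following the proposal quoted above.

```python
import sys


class Pathname(str):
    """Hypothetical str subclass that keeps the raw bytes received from the OS."""

    def __new__(cls, raw, encoding=None):
        # Assumption: try the file system encoding first unless told otherwise.
        if encoding is None:
            encoding = sys.getfilesystemencoding()
        try:
            text = raw.decode(encoding)
            decodable = True
        except UnicodeDecodeError:
            # Latin-1 maps every byte to a code point, so this decode cannot
            # fail; the result is only a stand-in usable as text.
            text = raw.decode("latin-1")
            decodable = False
        self = super().__new__(cls, text)
        self.raw = raw              # original bytes, untouched
        self.decodable = decodable  # False if the Latin-1 fallback was used
        return self


# A name that is not valid UTF-8 still yields a usable text value,
# and the original bytes remain available for passing back to the OS.
name = Pathname(b"caf\xe9.txt", encoding="utf-8")
```

The text value can be sliced, compared, and printed like any other string, while code that has to talk to the OS again would use name.raw rather than re-encoding the text. Note that ordinary string operations on such an object return plain str, so the raw bytes are lost after any manipulation; a real design would have to decide what to do about that.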