[Python-Dev] PEP 383 and GUI libraries
James Y Knight
foom at fuhm.net
Sat May 2 04:12:15 CEST 2009
On May 1, 2009, at 9:42 PM, Zooko O'Whielacronx wrote:
> Yep, I reversed the order of encode() and decode(). However, my whole
> statement was utterly wrong and shows that I still didn't fully get it
> yet. I have flip-flopped again and currently think that PEP 383 is
> useless for this use case and that my original plan [1] is still the
> way to go. Please let me know if you spot a flaw in my plan or a
> ridiculousity in my requirements, or if you see a way that PEP 383 can
> help me.
If I were designing a new system such as this, I'd probably just go
for utf-8b *always*. That is, set the filesystem encoding to utf-8b.
The end. All files always keep the same bytes when transferred between
Unix systems. Thus, the 99% of the world that uses either Windows
or a UTF-8 locale gets useful filenames inside tahoe. The other
1% of the world that uses something like Latin-1, EUC-JP, etc. on
their local system sees mojibake filenames in tahoe, but will see the
same filename that they put in when they take it back out.
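To make the round-trip claim concrete, here is a minimal sketch. It assumes Python 3.1 or later and uses the "surrogateescape" error handler from PEP 383 together with the utf-8 codec as a stand-in for utf-8b; the helper names to_text and to_bytes are made up for illustration and are not part of tahoe or any library.

```python
# Sketch only: utf-8 + surrogateescape used as a stand-in for "utf-8b".
# to_text / to_bytes are hypothetical helper names, not a real API.

def to_text(raw: bytes) -> str:
    """Decode filename bytes; undecodable bytes become lone surrogates."""
    return raw.decode("utf-8", "surrogateescape")


def to_bytes(name: str) -> bytes:
    """Encode back; lone surrogates turn into the original bytes again."""
    return name.encode("utf-8", "surrogateescape")


raw_names = [
    "café.txt".encode("utf-8"),    # name created under a UTF-8 locale
    "café.txt".encode("latin-1"),  # same name created under a Latin-1 locale
    b"\xff\xfe-not-text",          # bytes that are not valid UTF-8 at all
]

for raw in raw_names:
    text = to_text(raw)
    # The bytes that went in are the bytes that come out, whatever the locale.
    assert to_bytes(text) == raw
    print(ascii(text))
```

The UTF-8 name comes out readable, the Latin-1 name comes out with an escaped byte in it (the mojibake case), and in every case the original bytes are recovered.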
GNOME has used only UTF-8 for filename display for a few years
now, for example, so this isn't exactly an unheard-of position to
take...
But if you don't do that, then I still don't see what purpose your
requirements serve. If I have two systems, one with a UTF-8 locale
and one with a Latin-1 locale, why should transmitting filenames from
system 1 to system 2 through tahoe preserve the raw bytes, but doing
the reverse *not* preserve the raw bytes? (All byte sequences are
valid in Latin-1, remember, so they'll all decode into unicode without
error, and then be re-encoded in UTF-8...) This seems like rather
useless behavior to me.
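A small sketch of the parenthetical above, assuming Python 3. It shows only the codec behavior, not tahoe's actual code path, and the function name through_latin1_then_utf8 is invented for this example.

```python
# Sketch only: illustrates the Latin-1 -> unicode -> UTF-8 path described
# above. through_latin1_then_utf8 is a hypothetical name, not tahoe code.

def through_latin1_then_utf8(raw: bytes) -> bytes:
    """Decode as Latin-1 (never fails), then re-encode as UTF-8."""
    return raw.decode("latin-1").encode("utf-8")


# Every possible byte value decodes under Latin-1 without error.
all_bytes = bytes(range(256))
assert all_bytes.decode("latin-1").encode("latin-1") == all_bytes

# A name with a non-ASCII byte, as found on the Latin-1 system.
raw = b"caf\xe9.txt"
reencoded = through_latin1_then_utf8(raw)
assert reencoded == b"caf\xc3\xa9.txt"
assert reencoded != raw  # the raw bytes were not preserved

# The same bytes on the UTF-8 system fail a strict decode, so that system
# is the only one where "undecodable, keep the raw bytes" can ever happen.
try:
    raw.decode("utf-8")
    strict_utf8_ok = True
except UnicodeDecodeError:
    strict_utf8_ok = False
assert not strict_utf8_ok
```

Pure-ASCII names pass through unchanged in both directions; the asymmetry only shows up once a byte above 0x7F is involved.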
James