[Python-3000] [Python-Dev] Filename as byte string in python 2.6 or 3.0?
James Y Knight
foom at fuhm.net
Tue Sep 30 01:33:47 CEST 2008
On Sep 29, 2008, at 7:23 PM, Adam Olsen wrote:
> An ugly hack, but more correct than UTF-8b or any similar attempt to
> do "unicode but not quite unicode"; either it's lossy, or it's not
> unicode. There's no in between.
Promoting the use of ISO 8859-1 to decode mostly-UTF-8 data seems like a
very poor way forward. I don't see how you can claim it's more
correct. It's correct in no case except for pure ASCII on a UTF-8
system.
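To make that concrete, here is a minimal sketch of what happens when a UTF-8 filename is decoded as ISO 8859-1. The helper name decode_as_latin1 and the sample filenames are made up for illustration; only the standard bytes.decode and str.encode calls are used.

```python
def decode_as_latin1(raw: bytes) -> str:
    """Decode bytes as ISO 8859-1. This never fails, because every
    byte value 0x00-0xFF maps to exactly one code point."""
    return raw.decode("iso-8859-1")


# Hypothetical filenames as a UTF-8 system would hand them over.
ascii_name = b"readme.txt"
utf8_name = "caf\u00e9.txt".encode("utf-8")  # b"caf\xc3\xa9.txt"

# Pure ASCII comes out right.
ascii_text = decode_as_latin1(ascii_name)  # "readme.txt"

# Anything else comes out as two wrong characters per non-ASCII one:
# "caf\u00c3\u00a9.txt" instead of "caf\u00e9.txt".
wrong_text = decode_as_latin1(utf8_name)

# The bytes do survive a round trip, so nothing is lost; the text is
# simply not what the user typed.
round_trip = wrong_text.encode("iso-8859-1")
```

So the round trip is lossless, but the string shown to the user is wrong for every name that is not pure ASCII.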
I still like the UTF-8b proposal, but if you want to push against
that, I don't see any sensible alternative but to move back towards a
bytestring API. Having two parallel APIs or a mixture of data types is
confusing, so just toss the Unicode APIs entirely. That'd be much,
much nicer than having everyone use ISO 8859-1, incorrectly, for their
platform encoding.
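For comparison, a rough sketch of the UTF-8b behaviour. It assumes a Python 3 new enough to have the "surrogateescape" error handler (added later, by PEP 383), which maps each undecodable byte to a lone surrogate in the range U+DC80..U+DCFF, the same trick UTF-8b describes. The helper names and the sample bytes are hypothetical.

```python
def decode_utf8b(raw: bytes) -> str:
    """Decode UTF-8, turning each undecodable byte 0xNN into the lone
    surrogate U+DCNN instead of raising an error."""
    return raw.decode("utf-8", errors="surrogateescape")


def encode_utf8b(text: str) -> bytes:
    """Inverse of decode_utf8b: escaped surrogates become raw bytes again."""
    return text.encode("utf-8", errors="surrogateescape")


# Hypothetical filename: valid UTF-8 for "caf\u00e9-" followed by a
# stray 0xFF byte that is not valid UTF-8.
raw_name = b"caf\xc3\xa9-\xff.txt"

text_name = decode_utf8b(raw_name)  # "caf\u00e9-\udcff.txt"
same_bytes = encode_utf8b(text_name)  # identical to raw_name
```

The valid part decodes to the text the user expects, and the stray byte still round-trips. The price is the point Adam raises: the resulting string contains a lone surrogate, so it is not strictly well-formed Unicode.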
On Windows, the platform-native Unicode strings could simply be
encoded into UTF-8 when entering Python-land, and decoded back to
Unicode when leaving Python-land, to keep the API consistently
bytestring oriented on both platforms.
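A minimal sketch of that boundary, assuming the native Windows names are well-formed Unicode. The function names entering_python and leaving_python are invented for illustration and are not any existing API.

```python
def entering_python(native_name: str) -> bytes:
    """Hypothetical boundary helper: platform Unicode name -> UTF-8
    bytestring, as seen by Python code."""
    return native_name.encode("utf-8")


def leaving_python(name: bytes) -> str:
    """Hypothetical boundary helper: UTF-8 bytestring -> platform
    Unicode name, just before calling the native API."""
    return name.decode("utf-8")


# Hypothetical filename reported by the platform.
native = "r\u00e9sum\u00e9.txt"

inside = entering_python(native)  # b"r\xc3\xa9sum\xc3\xa9.txt"
outside = leaving_python(inside)  # same string as native
```

Application code would then see bytes on both platforms, and only the two boundary functions would need to know which platform they are running on.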
James