[Python-3000] [Python-Dev] Filename as byte string in python 2.6 or 3.0?
James Y Knight
foom at fuhm.net
Wed Oct 1 20:30:29 CEST 2008
BTW, Windows will cheerfully let you create and access files with
"garbage surrogates" in their names.
Try it yourself:
import os
open(u"\ud8fd", 'w').close()  # filename is a single lone surrogate
os.listdir(u'.')
IMO that pretty much blows out of the water any suggestion that encoding
invalid UTF-8 sequences into lone surrogates is an evil and broken
thing to do.
So, I'm back to favoring the lone surrogate plan over the U+0000 plan.
But either one seems better than the alternatives.
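For readers who want to see what the lone surrogate plan looks like in practice, here is a minimal sketch. It assumes a Python 3 interpreter and uses the "surrogateescape" error handler, which was standardized after this message was written, so treat it as an illustration of the idea rather than the exact scheme under discussion. The byte string and variable names are made up for the example.

```python
# Hypothetical filename bytes: Latin-1 encoded, so not valid UTF-8.
raw = b"caf\xe9.txt"

# Assumption: Python 3's "surrogateescape" handler stands in for the
# lone surrogate plan. The undecodable byte 0xE9 becomes U+DCE9.
name = raw.decode("utf-8", "surrogateescape")

# Encoding with the same handler restores the original bytes exactly.
restored = name.encode("utf-8", "surrogateescape")
```

The point of the sketch is the round trip: the undecodable byte survives as a lone surrogate in the string and is turned back into the same byte on the way out.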
James
On Sep 29, 2008, at 11:11 PM, Stephen J. Turnbull wrote:
> James Y Knight writes:
>> On Sep 29, 2008, at 3:32 AM, Adam Olsen wrote:
>
>>> UTF-8b doesn't work as intended. It produces an invalid unicode
>>> object (garbage surrogates) that cannot be used with external APIs
>>> or libraries that require unicode.
>>
>> I'd be interested to hear more detail on what you expect the
>> practical ramifications of this to be. It doesn't sound likely to be
>> a problem to me.
>
> That's because you have a specific use case in mind. Adam clearly has
> in mind passing the filename on to a library which might proceed to
> signal an error (to him, unexpected) on garbage surrogates. He
> doesn't want to be surprised by that.
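The kind of surprise described above can be sketched as follows. This assumes a Python 3 interpreter, where the strict UTF-8 encoder rejects lone surrogates; the helper name is hypothetical and stands in for any external library that insists on well-formed Unicode.

```python
def strict_library_call(text):
    """Hypothetical stand-in for a library that requires valid Unicode."""
    # The default (strict) UTF-8 encoder refuses lone surrogates.
    return text.encode("utf-8")

name = "\ud8fd"  # the lone surrogate used as a filename earlier in this message

try:
    strict_library_call(name)
    rejected = False
except UnicodeEncodeError:
    # This is the error that would reach the unsuspecting caller.
    rejected = True
```

A well-formed name passes through the same helper without complaint, so the failure only shows up for the filenames that needed the escape in the first place.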