[Python-Dev] Re: adding a bytes sequence type to Python
James Y Knight
foom at fuhm.net
Wed Aug 18 00:04:45 CEST 2004
On Aug 17, 2004, at 5:18 PM, Bob Ippolito wrote:
> On Aug 17, 2004, at 5:11 PM, Martin v. Löwis wrote:
>
>> Bob Ippolito wrote:
>>> How would you embed raw bytes if the string was unicode?
>>
>> The most direct notation would be
>>
>> bytes("delimited packet\x00")
>>
>> However, people might not understand what is happening, and
>> Guido doesn't like it if the bytes are >127.
>
> I guess that was a bad example, what if the delimiter was \xff?
Indeed, if all strings are unicode, the question becomes: what encoding
does bytes() use to translate unicode characters to bytes. Two
alternatives have been proposed so far:
1) ASCII (translate chars as their codepoint if < 128, else error)
2) ISO-8859-1 (translate chars as their codepoint if < 256, else error)
I think I'd choose #2, myself.
> I know that map(ord, u'delimited packet\xff') would get correct
> results.. but I don't think I like that either.
Why would you consider that wrong? ord(u'\xff') *should* return 255.
Just as ord(u'\u1000') returns 4096. There's nothing mysterious there.
James
More information about the Python-Dev
mailing list