[Python-ideas] RFC: bytestring as a str representation [was: a new bytestring type?]
MRAB
python at mrabarnett.plus.com
Tue Jan 7 20:32:26 CET 2014
On 2014-01-07 18:38, Ethan Furman wrote:
> On 01/07/2014 10:22 AM, MRAB wrote:
>> On 2014-01-07 17:46, Andrew Barnert wrote:
>>> On Jan 7, 2014, at 7:44, Steven D'Aprano <steve at pearwood.info> wrote:
>>>
>> I was thinking about Ethan's suggestion of introducing a new bytestring
>> class and a lot of these suggestions are what I thought the bytestring
>> class could do.
>
>>>>
>>>> Suppose we take a pure-ASCII byte-string and decode it:
>>>>
>>>> b'abcd'.decode('ascii-compatible')
>>>>
>> That would be:
>>
>> bytestring(b'abcd')
>>
>> or even:
>>
>> bytestring('abcd')
>>
>> [snip]
>>>
>>>> Suppose we take a byte-string with a non-ASCII byte:
>>>>
>>>> b'abc\xFF'.decode('ascii-compatible')
>>>>
>> That would be:
>>
>> bytestring(b'abc\xFF')
>>
>> Bytes outside the ASCII range would be mapped to Unicode low
>> surrogates:
>>
>> bytestring(b'abc\xFF') == bytestring('abc\uDCFF')
>
> Not sure what you mean here. The resulting bytes should be 'abc\xFF' and of length 4.
>
'abc\xFF' is a Unicode string (its last character is U+00FF), but you
wouldn't be able to convert it to a bytestring, because '\xFF' is a
code point outside the ASCII range and is not a low surrogate.
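
For comparison, here is a rough sketch of the same mapping using the existing 'surrogateescape' error handler (PEP 383) with the plain 'ascii' codec. This is only an approximation of what the proposed bytestring class might do, not its implementation; the helper name to_bytes_or_none is made up for illustration.

```python
# Sketch only: uses the real 'surrogateescape' error handler, not the
# proposed bytestring class (which does not exist).

raw = b'abc\xFF'

# Decoding: the non-ASCII byte 0xFF becomes the lone low surrogate U+DCFF.
text = raw.decode('ascii', errors='surrogateescape')
assert text == 'abc\udcff'
assert len(text) == 4

# Encoding the escaped string restores the original four bytes.
round_trip = text.encode('ascii', errors='surrogateescape')
assert round_trip == raw


def to_bytes_or_none(s):
    """Hypothetical helper: encode s, or return None if it cannot be encoded."""
    try:
        return s.encode('ascii', errors='surrogateescape')
    except UnicodeEncodeError:
        return None


# U+00FF is outside the ASCII range and is not an escaped low surrogate,
# so this string cannot be converted.
rejected = to_bytes_or_none('abc\xFF')
assert rejected is None
```

So 'abc\uDCFF' converts back to four bytes, while 'abc\xFF' is rejected, which is the distinction made above.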