[Python-ideas] Python 3.x and bytes

Fri May 20 15:14:41 CEST 2011

On Fri, May 20, 2011 at 9:05 AM, Ethan Furman <ethan at stoneleaf.us> wrote:
> Terry Reedy wrote:
>>
>> As far as I noticed, Ethan did not explain why he was extracting single
>> bytes and comparing to a constant, so it is hard to know if he was even
>> using them properly.
>
> The header of a .dbf file details the field composition such as name, size,
> type, etc.  The type is C for character, L for logical, etc, and the end of
> the field definition block is signaled by a CR byte.
>
> So in one spot of my code I (used to) have a comparison
>
> if hdr[0] == b'\x0d': # end of fields
>
> which I have changed to
>
> if hdr[0] == 0x0d:
>
> and elsewhere:
>
> field_type = hdr[11]
>
> which is now
>
> field_type = chr(hdr[11])
>
> since the first 128 code points of Unicode are ASCII.
>
> However, I can see this silently producing errors for values between 128 and
> 255 -- consider:
>
> --> chr(0xa1)
> '¡'
> --> b'\xa1'.decode('cp1251')
> '\u040e'
>
> So because my single element access to the byte string lost its bytes type,
> I may no longer get the correct result.
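
To make the mismatch concrete, here is a minimal sketch of the two conversions side by side. The cp1251 code page is just the example from the quoted message, and byte_to_text is a hypothetical helper name, not something from the original code.

```python
def byte_to_text(value, encoding):
    """Decode one integer byte value (0-255) using an explicit codec."""
    return bytes([value]).decode(encoding)

# chr() treats the integer as a Unicode code point.
as_code_point = chr(0xA1)

# Decoding with the file's code page can yield a different character.
as_cp1251 = byte_to_text(0xA1, "cp1251")

# For ASCII values (0-127) the two approaches agree.
ascii_same = chr(0x43) == byte_to_text(0x43, "cp1251")
```

The two results only diverge above 127, which is why the chr() version can look correct on typical headers and still be wrong for some files.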

Can you use a single-element slice as a workaround?

>>> b'01234'
b'01234'
>>> b'01234'[0]
48
>>> b'01234'[0:1]
b'0'
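
Applied to the header checks quoted above, that might look like the following sketch. The hdr and terminator values are made up for illustration (only the byte positions come from the quoted message), and the ascii codec is an assumption; a real file would need its own code page.

```python
# Made-up 12-byte record; byte 11 holds the field type, as in the quoted code.
hdr = b"NAME\x00\x00\x00\x00\x00\x00\x00C"
terminator = b"\x0d"

# A one-byte slice keeps the bytes type, so the original comparison
# against a bytes literal still works unchanged.
is_end = hdr[0:1] == b"\x0d"
term_is_end = terminator[0:1] == b"\x0d"

# The field type stays bytes and can be decoded with an explicit codec.
field_type = hdr[11:12]
field_type_text = field_type.decode("ascii")
```

One difference to keep in mind: slicing past the end of the data returns b'' rather than raising IndexError, so a truncated header will fail the comparison quietly instead of raising.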