
On 20 May 2011 14:05, Ethan Furman <ethan@stoneleaf.us> wrote:
Terry Reedy wrote:
As far as I noticed, Ethan did not explain why he was extracting single bytes and comparing to a constant, so it is hard to know if he was even using them properly.
The header of a .dbf file details the field composition: name, size, type, and so on. The type is C for character, L for logical, etc., and the end of the field definition block is signaled by a CR byte (0x0d).
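A minimal sketch of walking that field-definition block, assuming the dBASE III layout (records start at offset 32 of the header, each record is 32 bytes, the name is in bytes 0-10, the type code in byte 11 and the length in byte 16 -- those offsets are assumptions, not something stated above):

def read_field_defs(hdr):
    """Yield (name, type_code, length) for each field-definition record.

    Assumes the dBASE III layout described above; the block is
    terminated by a 0x0d (CR) byte.
    """
    offset = 32
    while hdr[offset] != 0x0d:                # CR byte ends the block
        record = hdr[offset:offset + 32]
        name = record[:11].split(b'\x00', 1)[0].decode('ascii')
        type_code = chr(record[11])           # 'C', 'L', 'N', ...
        length = record[16]                   # indexing bytes gives an int
        yield name, type_code, length
        offset += 32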
So in one spot of my code I (used to) have a comparison
if hdr[0] == b'\x0d': # end of fields
which I have changed to
if hdr[0] == 0x0d:
This seems to me to be an improvement, regardless...
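The difference is easy to see at the interactive prompt (the header bytes here are made up):

>>> hdr = bytes([0x0d, 0x43])     # made-up header bytes
>>> hdr[0]                        # indexing bytes gives an int in Python 3
13
>>> hdr[0] == b'\x0d'             # so comparing to a bytes literal is always False
False
>>> hdr[0] == 0x0d
True
>>> hdr[0:1] == b'\x0d'           # slicing, by contrast, keeps the bytes type
True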
and elsewhere:
field_type = hdr[11]
which is now
field_type = chr(hdr[11])
since the first 128 code points of Unicode are the same as ASCII.
That seems reasonable, if you have a fixed set of known-ASCII values that are field types. If you care about detecting invalid files, then do a field_type in 'CL...' test to validate and you're fine.
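A sketch of that validation, with an illustrative (not complete) set of type codes:

VALID_TYPES = 'CDLMN'   # illustrative, not the full set of dBASE type codes

def field_type_of(record):
    """Return a field-definition record's type code, validating it first."""
    ftype = chr(record[11])
    if ftype not in VALID_TYPES:
        raise ValueError('not a valid .dbf field type: %r' % ftype)
    return ftype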
However, I can see this silently producing errors for values between 128 and 255 -- consider:
--> chr(0xa1)
'¡'
--> b'\xa1'.decode('cp1251')
'\u040e'
But those aren't valid field codes, so why do you care? And why are you using cp1251? I thought you said they were ASCII? As I said, if you're checking for error values, just start with either a check for specific values, or simply check the field type is <128.
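That is, a minimal version of that check, reusing hdr from the code above:

type_byte = hdr[11]
if type_byte >= 128:                  # not ASCII, so it cannot be a valid type code
    raise ValueError('field type byte is not ASCII: %#x' % type_byte)
field_type = chr(type_byte)           # chr() of 0..127 agrees with ASCII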
So because my single-element access to the byte string lost its bytes type, I may no longer get the correct result.
I still don't see your problem here...
Paul.