[Python-ideas] RFC: bytestring as a str representation [was: a new bytestring type?]
Ethan Furman
ethan at stoneleaf.us
Wed Jan 8 02:19:38 CET 2014
On 01/07/2014 04:39 PM, Steven D'Aprano wrote:
> On Tue, Jan 07, 2014 at 08:48:05AM -0800, Ethan Furman wrote:
>
>> [...] My binary stream is mixed:
>>
>> - binary that has to be converted (4-byte ints, for example)
>> - ascii that has to be converted (ints stored as ascii text)
>> - encoded text (character and memo fields)
>
> Ethan, you keep referring to ascii text and encoded text as if they are
> different things. They're not.
Would you feel better if I called them ASCII-encoded text, and other-encoded text? And they are different, if for no
other reason than they are using different encodings. Further, the ASCII-encoded text can be directly compared with
byte sequences because . . . they're bytes! ;)
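A minimal sketch of that direct comparison, assuming Python 3. The type codes and the `describe` helper are made up for illustration and are not taken from any real file format:

```python
# Hypothetical one-byte type codes, keyed by bytes rather than str.
FIELD_TYPES = {b'C': 'Character', b'N': 'Numeric', b'L': 'Logical'}


def describe(type_code: bytes) -> str:
    """Look up a type code exactly as it was read from the file, with no decode step."""
    return FIELD_TYPES.get(type_code, 'Unknown')


raw = b'N'                  # as it would come from a binary read
is_numeric = (raw == b'N')  # bytes compare directly with a bytes literal
```

Here `describe(raw)` gives `'Numeric'` without the byte ever becoming a str.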
> You have a binary file containing bytes.
> Some of those bytes represent data of one kind (say, 4-byte ints). Some
> of those bytes represent data of a different kind (Latin-1 encoded text
> representing character and memo fields) and other bytes represent data
> of a third kind (ASCII encoded text representing ints, but you don't
> mention what the meaning of those ints is).
ASCII-encoded text representing ints are ints. I don't know what they mean, but presumably they have something to do with
whatever the user named the field. For example, I would imagine that b'35' in an AGE field meant 35 years; luckily I
only have to give the user back the integer 35, not figure out what it's supposed to mean.
> ASCII or Latin-1, the text is still encoded into bytes, and still needs
> to be decoded back to text.
No, it doesn't. I don't need to convert b'35' into u'35' to convert to 35. I don't need to convert b'N' to u'N' to
know I have a Numeric field, nor b'T' to u'T' to get True.
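To make that concrete, here is a rough sketch in Python 3. The record layout is invented for the example (it is not the actual file format under discussion), and the variable names are placeholders:

```python
import struct

# Hypothetical record layout, invented for illustration:
#   4 bytes  little-endian signed int (binary)
#   2 bytes  ASCII digits (an int stored as text)
#   1 byte   logical flag, b'T' or b'F'
#   5 bytes  Latin-1 encoded character field, space padded
record = struct.pack('<i', 1234) + b'35' + b'T' + 'caf\xe9 '.encode('latin-1')

count = struct.unpack('<i', record[0:4])[0]      # binary int: unpack it
age = int(record[4:6])                           # int() accepts ASCII digits in bytes
flag = record[6:7] == b'T'                       # bytes compared with bytes, no decode
name = record[7:12].decode('latin-1').rstrip()   # only the real text gets decoded
```

Only the last field goes through `.decode()`; the other three go straight from bytes to the value handed back to the user.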
--
~Ethan~