[Python-Dev] Python 3.x and bytes
glyph at twistedmatrix.com
Thu May 19 20:22:20 CEST 2011
On May 19, 2011, at 1:43 PM, Guido van Rossum wrote:
> -1; the result is not a *character* but an integer.
Well, really the result ought to be an octet, but I suppose adding an 'octet' type is beyond the scope of even this sprawling discussion :).
> I'm personally favoring using b'a' and possibly hiding this in a constant definition.
As someone who spends a frankly unfortunate amount of time handling protocols where things like this are necessary, I agree with this recommendation. In protocols where one needs to compare network data with one-byte type identifiers or packet prefixes, more (documented) constants and less inscrutable junk like
    if p == 'c':
        ...
    elif p == 'j':
        ...
    elif p == 'J': # for compatibility
        ...
would definitely be a good thing. Of course, I realize that this sort of programmer is more likely to replace those constants with 99, 106, 74 than to take a moment to document what they mean, but at least they'll have to pause for a moment and realize that they have now lost _all_ mnemonics...
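For illustration, a minimal sketch of what "b'a' hidden in a constant definition" might look like in Python 3. The tag names, their meanings, and the classify() helper are all hypothetical, invented for this example rather than taken from any real protocol:

```python
# Hypothetical one-byte packet type tags; the names and meanings are
# invented for illustration only.
TAG_CONNECT = b'c'
TAG_JOIN = b'j'
TAG_JOIN_LEGACY = b'J'  # older spelling, accepted for compatibility


def classify(packet: bytes) -> str:
    """Return a readable name for the packet's one-byte type tag."""
    # Slice rather than index: in Python 3, packet[0] is an int (e.g. 99),
    # while packet[:1] is a length-1 bytes object that compares with b'c'.
    tag = packet[:1]
    if tag == TAG_CONNECT:
        return "connect"
    elif tag in (TAG_JOIN, TAG_JOIN_LEGACY):
        return "join"
    return "unknown"
```

Slicing also behaves sensibly on an empty packet (packet[:1] is just b''), whereas indexing would raise IndexError.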
In fact, I feel like I would want to push in the opposite direction: rather than treating one-byte bytes slices less like integers, I wish I could more easily treat n-byte sequences _more_ like integers! :). More protocols have 2-byte or 4-byte network-endian packed integers embedded in them than have individual tag bytes that I want to examine. For the typical ASCII-ish protocol where you want to look at command names and CRLF-separated messages, you'd never want to look at an individual octet; stringish operations like split() will give you what you want.
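As a rough sketch of both cases using what the standard library already offers: struct and int.from_bytes() for network-endian packed integers, and split() for an ASCII-ish protocol. The frame layout and the command text below are made up for the example and are not from any particular protocol:

```python
import struct

# Hypothetical frame header: a 2-byte type followed by a 4-byte length,
# both network-endian (big-endian), then the payload.
frame = b'\x00\x07\x00\x00\x01\x00' + b'payload'

# '!' selects network byte order with no padding; H = 2 bytes, I = 4 bytes.
msg_type, length = struct.unpack('!HI', frame[:6])

# int.from_bytes() treats an arbitrary n-byte slice as one integer.
same_length = int.from_bytes(frame[2:6], 'big')

# ASCII-ish protocol: no individual octets needed, only stringish operations.
data = b'HELO example.org\r\nQUIT\r\n'
lines = data.split(b'\r\n')
command = lines[0].split(b' ', 1)[0]
```

Here msg_type is 7, both length and same_length are 256, and command is b'HELO'; at no point does the code inspect a single octet.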