[Python-Dev] Python 3.x and bytes

Glyph Lefkowitz glyph at twistedmatrix.com
Thu May 19 20:22:20 CEST 2011


On May 19, 2011, at 1:43 PM, Guido van Rossum wrote:

> -1; the result is not a *character* but an integer.

Well, really the result ought to be an octet, but I suppose adding an 'octet' type is beyond the scope of even this sprawling discussion :).

> I'm personally favoring using b'a'[0] and possibly hiding this in a constant definition.

As someone who spends a frankly unfortunate amount of time handling protocols where things like this are necessary, I agree with this recommendation.  In protocols where one needs to compare network data with one-byte type identifiers or packet prefixes, more (documented) constants and less inscrutable junk like

if p == 'c':
   ...
elif p == 'j':
   ...
elif p == 'J': # for compatibility
   ...

would definitely be a good thing.  Of course, I realize that this sort of programmer would sooner replace those constants with 99, 106, 74 than take a moment to document what they mean, but at least they'll have to pause for a moment and realize that they have now lost _all_ mnemonics...
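
A minimal sketch of the constant-definition approach, assuming Python 3 bytes semantics. The tag names and their meanings are hypothetical and chosen only for illustration. Only the byte values 'c', 'j' and 'J' come from the example above.

```python
# Hypothetical one-byte packet tags; the names are illustrative only.
# b'c'[0] is an int in Python 3, so the mnemonic survives in the source.
TAG_CONNECT = b'c'[0]      # 99
TAG_JOIN = b'j'[0]         # 106
TAG_JOIN_COMPAT = b'J'[0]  # 74, older spelling kept for compatibility


def classify(packet: bytes) -> str:
    """Return a label for the packet based on its first octet."""
    p = packet[0]  # indexing a bytes object yields an int
    if p == TAG_CONNECT:
        return "connect"
    elif p in (TAG_JOIN, TAG_JOIN_COMPAT):
        return "join"
    return "unknown"
```

Comparing packet[0] against b'c' directly would always be False in Python 3, since an int never equals a bytes object. Naming the constants makes the comparison both correct and readable.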

In fact, I feel like I would want to push in the opposite direction: rather than treating one-byte bytes slices less like integers, I wish I could more easily treat n-byte sequences _more_ like integers! :).  More protocols have 2-byte or 4-byte network-endian packed integers embedded in them than have individual tag bytes that I want to examine.  For the typical ASCII-ish protocol where you want to look at command names and CRLF-separated messages, you'd never want to look at an individual octet; stringish operations like split() will give you what you want.
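
For reference, a rough sketch of how both cases can be handled with the standard library, assuming Python 3.2 or later for int.from_bytes(). The helper names and the sample data are hypothetical.

```python
import struct


def read_u16(data: bytes, offset: int = 0) -> int:
    """Decode a 2-byte network-endian (big-endian) unsigned integer."""
    return int.from_bytes(data[offset:offset + 2], "big")


def read_u32(data: bytes, offset: int = 0) -> int:
    """Decode a 4-byte network-endian unsigned integer using struct."""
    (value,) = struct.unpack_from("!I", data, offset)
    return value


# Hypothetical binary header: a 2-byte field followed by a 4-byte field.
header = b"\x01\x02\x00\x00\x01\x00"
field_a = read_u16(header)      # 0x0102
field_b = read_u32(header, 2)   # 0x00000100

# Hypothetical ASCII-ish protocol: split on CRLF, then on whitespace,
# without ever looking at an individual octet.
raw = b"HELO example.org\r\nQUIT\r\n"
lines = [line for line in raw.split(b"\r\n") if line]
commands = [line.split()[0] for line in lines]
```

Either int.from_bytes() or the struct module works for the packed-integer case. struct tends to be more convenient when several fields are unpacked at once.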
